Abstract

Existing theory yields useful performance criteria and processing techniques for acoustic
pressure-sensor arrays. Acoustic vector-sensor arrays, which measure particle velocity
and pressure, offer significant potential but require fundamental changes to
algorithms and performance assessment.

This thesis develops new analysis and processing techniques for acoustic vector-sensor
arrays. First, the thesis establishes performance metrics suitable for vector-sensor
processing. Two novel performance bounds define optimality and explore the
limits of vector-sensor capabilities. Second, the thesis designs non-adaptive array
weights that perform well when interference is weak. Obtained using convex optimization,
these weights substantially improve conventional processing and remain
robust to modeling errors. Third, the thesis develops subspace techniques that enable
near-optimal adaptive processing. Subspace processing reduces the problem dimension,
improving convergence or shortening training time.
Chapter 1


Introduction 


Because they are often reliable, easy to analyze, and straightforward to process, pressure
sensor arrays have dominated sonar for decades. Recent advances in sensor quality
and miniaturization have stirred interest in more complex devices, those that measure
velocity or acceleration in addition to pressure. Each of these “vector-sensors”
provides several measurements, offering significant potential and fresh challenges.
Examining the use of vector-sensor arrays in passive sonar reveals the promise such
arrays offer to the field of undersea surveillance.¹

1.1  Passive  Sonar  Background 

The  principles  that  have  historically  driven  passive  sonar  research  are  the  same  ones 
behind  this  work.  Therefore,  understanding  the  motivation  for  this  research  requires 
some  background  in  passive  sonar.  This  section  provides  a  brief  introduction  to 
passive sonar and pressure-sensor arrays.

Passive sonar, which quietly listens for emitted sound, is effective at undersea
surveillance for three reasons. First, sonar operates over great distances. Sound
waves sometimes travel thousands of miles underwater. Electromagnetic waves, by
contrast, generally travel short distances in saltwater before being absorbed. Second,
emitted sound is exploitable. Machinery produces characteristic sounds which aid in
the detection, localization, and classification of vessels. Third, passive sonar is covert.
Passive sonar systems are difficult to detect because they emit no energy. Active
sonar, by contrast, emits energy which adversaries could use for counter-detection
and avoidance.

¹This background material on sonar and vector-sensors is also covered in [1].

The  most  common  sensor  employed  for  sonar  is  the  hydrophone.  A  hydrophone 
measures  only  pressure,  essentially  forming  an  underwater  microphone.  Sound  waves 
passing  over  a  hydrophone  introduce  changes  in  pressure  that  are  measured  and  used 
for detection. Omnidirectional hydrophones are common because their construction,
maintenance, and analysis are well understood. Decades of experience with hydrophones
show they survive well in the corrosive ocean environment and are effective
when assembled into arrays.

The  most  common  array  configuration  is  the  uniformly  spaced  linear  array.  Linear 
arrays  are  often  fixed  to  the  side  of  a  ship,  mounted  on  the  sea  floor,  or  towed  behind 
a  moving  vessel.  When  a  vessel  travels  in  a  straight  line,  drag  pulls  a  towed  array  into 
an approximately linear shape. The exact location and orientation of each sensor are
usually unknown or subject to modeling errors.


1.2  Acoustic  Vector-Sensors 

Increasing  the  information  measured  by  a  sensor  generally  improves  its  performance. 
With  acoustic  measurements,  particle  velocity  provides  additional  information  about 
the direction of sound arrival. Acoustic vector-sensors each contain one omnidirectional
hydrophone measuring pressure and three orthogonal geophones measuring the
components of particle velocity.² Figure 1.2.1 illustrates a three-dimensional vector-sensor.
The geophone in the figure contains a suspended coil which slides along the
axis of a fixed magnet. Sound passing along the axis of the geophone vibrates the
coil and induces a current. The induced current provides a measurement of the velocity
component along the geophone axis.

²Although velocity sensors are common, many vector-sensors equivalently use accelerometers,
directional  hydrophones,  or  pressure-gradient  sensors. 
Figure 1.2.1: Notional diagram of a vector-sensor (labeled parts: vector-sensor, geophone)

Although  geological  vector-sensors  have  existed  for  decades,  recent  advances  in 
geophone design have increased their utility for sonar applications. Because vector-sensors
include directional information, they have the potential to improve the performance
of passive sonar systems.

1.3  Vector-Sensor  Processing 

Vector-sensor measurements provide more information than pressure-sensor measurements.
Using this additional information to improve performance is the role of vector-sensor
processing. This subsection illustrates the primary benefit of vector-sensor processing:
resolving ambiguous pressure-sensor measurements. Similar, more detailed
analyses  are  provided  in  [1,  2,  3,  4]. 

The benefit of vector-sensors is first apparent when comparing a three-dimensional
vector-sensor to an omnidirectional pressure-sensor. By definition, the response of
the omnidirectional pressure-sensor is equal in all directions. But because the vector-sensor
also measures particle velocity, a three-dimensional vector, it yields information
about the direction of a sound source. Put another way, all directions are ambiguous
to the pressure-sensor, but no directions are ambiguous to the vector-sensor. This lack
of ambiguity means a single vector-sensor is inherently directional. Vector-sensors are
also tunable: linear combinations of the four elements form a “pseudo-sensor” with
many different response patterns [3]. A few of these patterns are shown in Figure
1.3.1. By choosing appropriate weights, these patterns are easily rotated to emphasize
or null any arbitrary direction.

Figure 1.3.1: Example vector-sensor response patterns
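The tunable response described above can be sketched numerically. The following minimal Python example assumes a single vector-sensor whose four measurements are normalized to [1, u], with u the unit vector of Equation 1.7.1; the function names and the particular cardioid and dipole weights are illustrative choices, not taken from the cited references.

```python
import numpy as np

def unit_vector(phi, psi=0.0):
    """Unit vector toward azimuth phi and elevation psi (Equation 1.7.1)."""
    return np.array([np.cos(phi) * np.cos(psi),
                     np.sin(phi) * np.cos(psi),
                     np.sin(psi)])

def pseudo_sensor_response(weights, phi, psi=0.0):
    """Response of a weighted vector-sensor to a plane wave from (phi, psi).

    The four measurements are modeled as [1, u]: unit pressure gain plus
    cosine-response geophones (an assumed normalization).
    """
    measurement = np.concatenate(([1.0], unit_vector(phi, psi)))
    return weights @ measurement

# Cardioid steered toward azimuth phi0: unit gain at phi0, null opposite.
phi0 = np.pi / 3
cardioid = 0.5 * np.concatenate(([1.0], unit_vector(phi0)))

# Dipole along the same direction: the pressure channel is unused.
dipole = np.concatenate(([0.0], unit_vector(phi0)))

look = pseudo_sensor_response(cardioid, phi0)
back = pseudo_sensor_response(cardioid, phi0 + np.pi)
side = pseudo_sensor_response(dipole, phi0 + np.pi / 2)
print(f"cardioid look={look:.3f}, cardioid back={back:.3f}, dipole side={side:.3f}")
```

Changing `phi0` rotates both patterns, which is the sense in which a single vector-sensor can emphasize or null an arbitrary direction.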

The same behavior extends to arrays of vector-sensors. Compare a uniformly
spaced linear array composed of N vector-sensors to one composed of N omnidirectional
pressure-sensors. Example directional responses or “beampatterns” for both
array types are shown in Figure 1.3.2. Both arrays have N = 10 elements at frequency
$f = (5/7) f_d$, where $f_d$ is the design frequency (the frequency at which the inter-element
spacing is one-half wavelength). By choosing weights and linearly combining array
elements, the top and bottom plots are “steered” to $\pi/2$ and $-\pi/4$, respectively. The
response of a linear pressure-sensor array (PSA) is symmetric about rotation around
the array axis. This is evident in the symmetric PSA beampattern: arrivals from opposite
sides of the array yield identical pressure measurements. Changing the weights
applied to each element alters the beampattern, but the response is always symmetric.
The PSA beampattern always exhibits an ambiguous peak, or “backlobe,” in
the direction opposite the desired steering angle. In contrast, the vector-sensor array
(VSA) utilizes unambiguous measurements from each sensor to reduce the backlobe.
The level of backlobe reduction is determined by the choice of weights. In the top plot
of Figure 1.3.2, the VSA backlobe is driven near zero; in the bottom plot of Figure
1.3.2, it is reduced by 6 dB.

Figure 1.3.2: Example VSA and PSA response patterns
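The backlobe behavior can be checked with a short Python sketch. It assumes conventional weights (the steering replica scaled for unit gain in the look direction), a linear array along the x-axis, zero elevation, and the normalized measurement [1, u] for each vector-sensor. These weights and all function names are assumptions for illustration and may differ from the weights used to generate Figure 1.3.2.

```python
import numpy as np

N = 10                      # number of sensors
f_ratio = 5.0 / 7.0         # f / f_d
# Half-wavelength spacing at f_d, so k0 * d = pi * f / f_d at frequency f.
k0d = np.pi * f_ratio
positions = np.arange(N) - (N - 1) / 2.0   # sensor index along the x-axis

def replicas(phi):
    """Return (PSA, VSA) replica vectors for azimuth phi at zero elevation."""
    u = np.array([np.cos(phi), np.sin(phi), 0.0])
    phase = np.exp(1j * k0d * positions * u[0])        # array lies along x
    psa = phase                                        # N pressure measurements
    vsa = np.kron(phase, np.concatenate(([1.0], u)))   # 4N measurements
    return psa, vsa

def conventional_response(phi_steer, phi):
    """Magnitude responses (PSA, VSA) at phi for beams steered to phi_steer."""
    psa0, vsa0 = replicas(phi_steer)
    psa, vsa = replicas(phi)
    # Conventional weights: the steering replica scaled for unit look gain.
    w_psa = psa0 / np.vdot(psa0, psa0)
    w_vsa = vsa0 / np.vdot(vsa0, vsa0)
    return abs(np.vdot(w_psa, psa)), abs(np.vdot(w_vsa, vsa))

for phi_steer in (np.pi / 2, -np.pi / 4):
    # The backlobe is the mirror image of the look direction about the x-axis.
    psa_back, vsa_back = conventional_response(phi_steer, -phi_steer)
    print(f"steer {phi_steer:+.3f} rad: PSA backlobe {psa_back:.3f}, "
          f"VSA backlobe {vsa_back:.3f}")
```

Under these assumptions the PSA backlobe equals the mainlobe for both steering angles, while the VSA backlobe amplitude is scaled by (1 + cos 2φ₀)/2 for look azimuth φ₀: zero at broadside and one half (about 6 dB down) at −π/4.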

Directional  information  makes  VSA  processing  fundamentally  different  from  PSA 
processing.  Pressure-sensor  processing  exploits  phase  or  time-delay  measurements  to 
resolve signals and reject noise. Vector-sensors provide little additional phase information
because the sensor components are co-located; the directional components
yield  mostly  gain  information.  VSA  processing  must  exploit  both  gain  and  phase 
measurements  to  be  effective. 


1.4  Problem  Statement 

The additional measurements provided by vector-sensor arrays offer benefits and challenges.
As the previous section shows, vector-sensor arrays are more versatile than
arrays of only pressure-sensors. Exploiting this versatility raises a number of questions
addressed in this work. These questions fall into two broad categories that serve
to  organize  the  research: 

1.  How  well  can  a  vector-sensor  array  do?  How  can  the  vector-sensor  array 
“performance improvement” be quantified? By how much can vector-sensors
potentially  improve  performance? 

2.  How  can  a  vector-sensor  array  do  well?  How  can  vector-sensor  arrays 
achieve  good  performance  in  practice?  Without  a  priori  information,  what 
vector-sensor  processing  is  best?  How  can  vector-sensor  processing  adapt  to 
incorporate  new  data?  How  can  the  computational  cost  required  to  process 
vector-sensor  arrays  be  reduced? 
Research suggests that existing results do not resolve these questions. Although
vector-sensor array processing seems to be a straightforward extension to pressure-sensor
array processing, it requires fundamental changes to analyses and algorithms.

The remainder of this section highlights the difficulty of answering the above
questions with current approaches. Questions in the first category require metrics
and bounds to quantify vector-sensor performance. Questions in the second category
subdivide according to the two fields of array processing. The first field is nonadaptive
processing, where the sensors are combined linearly using fixed weights. The second
field is adaptive processing, where weights are allowed to change based upon observed
data.

1.4.1  Performance  Metrics  and  Limits 

Two performance dimensions commonly used to quantify and bound improvements in
array processing are array resolution and gain/directivity. The next two paragraphs
briefly show that VSA improvements are not expressed along these performance dimensions.

Either beamwidth or angle estimation error is typically used to quantify array resolution.
Figure 1.3.2 reveals that the VSA and PSA beamwidths are almost exactly the
same. A wide class of beampatterns, including those in [3, 4], relies on pattern multiplication
(see [5, §2.8]). The width of such beampatterns is virtually unchanged from
that of a pressure-sensor array [1, §2.1]. Another metric that quantifies array resolution
is the root mean squared error (RMSE) resulting from direction-of-arrival (DOA)
estimation. Improving array resolution lowers the RMSE. Bounds on the RMSE are
often derived and compared to the actual error of common direction-of-arrival algorithms;
for vector-sensor arrays this analysis appears in [1, 2, 4, 6]. Representative
plots are shown in Figure 1.4.1, derived from [1, §3.2] for the case of a single narrowband
source in white noise and an N = 13 element, mismatched, linear VSA. The
moderate mismatch scenario includes zero-mean Gaussian position, rotation, gain,
and phase errors; the direction-of-arrival algorithm is a conventional beamscan technique.
A detailed description of the parameters, algorithm, and bound is in [1]. Two
observations are clear from the figure: 1) the actual algorithm RMSE does not decrease
significantly with a vector-sensor array, and 2) the lower bound indicates that
only a modest improvement is possible. A key consideration is that vector-sensors increase
the number of measurements from N to 4N, but simply increasing the number
of pressure-sensors to 4N (keeping the same inter-element spacing) yields a smaller
beamwidth and a lower RMSE. Unlike increasing the number of pressure-sensors, however,
using vector-sensors achieves improvement without altering the physical length
of an array.

Figure 1.4.1: VSA vs. PSA resolution, measured by direction-of-arrival error

Vector-sensor arrays evidently do not substantially improve resolution, and research
further reveals that VSAs do not improve directivity more than PSAs with
comparable numbers of components. The directivity of vector-sensor arrays, presented
in [7], is at most 6 dB higher than that of pressure-sensor arrays. As with array
resolution, this improvement is no better than that achieved by simply increasing the
number of pressure-sensors from N to 4N.

Because the benefits of vector-sensors are not reflected in measures such as resolution
or directivity (considering the increased number of components), new measures
are necessary to quantify VSA performance. Although existing bounds are useful for
analyzing vector-sensor array configuration [6] and robustness [1], alternative bounds
are required to understand how much improvement VSAs offer along the new performance
dimensions.

1.4.2  Nonadaptive  VSA  Processing 

Some  of  the  most  powerful  nonadaptive  processing  techniques  become  difficult  or 
impossible with vector-sensor arrays. Designing fixed weights for nonadaptive processing
involves multiple objectives. Three of the most useful objectives are narrow
beamwidth,  low  sidelobe  level,  and  low  sensitivity  to  modeling  errors.  Analytical 
methods  enable  joint  optimization  of  the  PSA  beamwidth  and  sidelobe  level,  but 
these  methods  do  not  apply  to  VSAs.  As  a  result,  VSA  beampatterns  are  often 
designed  using  alternative  criteria. 
Many existing approaches are similar to [3, 4] and effectively choose weights to
maximize gain against some postulated noise field (see Section 4.2). One formulation
of this problem is the mathematical program

$$
\begin{aligned}
\text{minimize} \quad & \mathbf{w}^H \mathbf{R} \mathbf{w} \\
\text{subject to} \quad & \mathbf{w}^H \mathbf{v}_0 = 1
\end{aligned}
\tag{1.4.1}
$$

for some postulated covariance matrix $\mathbf{R}$ and signal replica $\mathbf{v}_0$. Choices for the postulated
covariance matrix are often combinations of isotropic noise, white noise, and
point sources. The resulting weights may have a simple closed form, and pattern
multiplication may allow for spatial tapering. For instance, choosing


$$
\mathbf{R} = \mathbf{v}_b \mathbf{v}_b^H + \cdots
\tag{1.4.2}
$$

with $\mathbf{v}_b$ being a signal replica directed at the backlobe, gives “point null” beampatterns
as shown in the top plot of Figure 1.4.2. Applying a 25 dB Taylor spatial taper
([5, §3.4.3]) to the weights yields the beampatterns shown in the bottom plot.
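A brief Python sketch may help make the mathematical program above concrete. For a Hermitian positive definite covariance, the standard closed-form solution is w = R⁻¹v₀ / (v₀ᴴR⁻¹v₀). The covariance used below (unit white noise plus a strong point source at the backlobe), the source power, and the helper names are assumptions chosen for illustration; they are not claimed to match the covariance used for Figure 1.4.2.

```python
import numpy as np

def vsa_replica(phi, n_sensors=10, f_ratio=5.0 / 7.0):
    """Replica of a uniform linear VSA on the x-axis (zero elevation)."""
    u = np.array([np.cos(phi), np.sin(phi), 0.0])
    idx = np.arange(n_sensors) - (n_sensors - 1) / 2.0
    phase = np.exp(1j * np.pi * f_ratio * idx * u[0])
    return np.kron(phase, np.concatenate(([1.0], u)))

def min_variance_weights(R, v0):
    """Minimize w^H R w subject to w^H v0 = 1 (R Hermitian positive definite)."""
    r_inv_v0 = np.linalg.solve(R, v0)
    return r_inv_v0 / np.vdot(v0, r_inv_v0)

phi_steer = -np.pi / 4
v0 = vsa_replica(phi_steer)
vb = vsa_replica(-phi_steer)     # backlobe: mirror image about the array axis

# Assumed covariance: unit white noise plus a strong point source at the backlobe.
source_power = 1e4               # hypothetical value
R = np.eye(v0.size) + source_power * np.outer(vb, vb.conj())

w = min_variance_weights(R, v0)
look_gain = abs(np.vdot(w, v0))  # equals 1 by the constraint
back_gain = abs(np.vdot(w, vb))  # driven toward zero by the point source
print(f"look gain {look_gain:.6f}, backlobe gain {back_gain:.2e}")
```

Increasing `source_power` deepens the null at the backlobe while the unit-gain constraint in the look direction is preserved.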

For a vector-sensor array, optimizing the important objectives of narrow beamwidth,
low sidelobe level, and low sensitivity requires new techniques. Because existing techniques
do not explicitly optimize over these objectives, the resulting weights are suboptimal
with respect to each objective. For instance, the beampatterns in Figure
1.4.2 leave room for improvements in mainlobe width, sidelobe level, and robustness.
Techniques  that  optimize  these  objectives  are  widely  used  for  PSA  processing,  so 
equivalent  techniques  for  VSA  processing  are  important  to  develop. 


1.4.3  Adaptive  VSA  Processing 

A  key  problem  in  adaptive  vector-sensor  array  processing  is  high  dimensionality. 
Vector-sensor  array  data  is  four  times  the  dimension  of  pressure-sensor  array  data 
because  of  the  additional  velocity  measurements.  This  high  dimension  complicates 
adaptive  processing  in  two  ways. 
First, it makes parameter estimation more difficult. Adaptive processing often
requires estimating the second-order moments of the data, i.e. the covariance matrix.
Logic similar to [8] quickly reveals the scope of this problem. The number of independent
observations, or “snapshots,” available is determined by the stationarity of the
environment and the length of the array. The environment is effectively stationary if
sources move less than a beamwidth during observation. The broadside beamwidths
of vector and pressure-sensor arrays are almost the same, $\Delta\theta \approx 2/N$. Recall that $N$
is the number of vector-sensors; the total number of measurements is $4N$ for a 3-D
vector-sensor array. For an array of length $L$ at the design frequency with wavelength
$\lambda$, $N = 2L/\lambda$. The worst case (shortest) stationarity time is then given by a broadside
source at range $R$ moving tangentially at speed $v$:

$$
\begin{aligned}
\Delta T_{\text{stat}} &\approx \Delta\theta \cdot R / v \\
&= \lambda R / (L v).
\end{aligned}
\tag{1.4.3}
$$

The time required to preserve accurate phase estimates for a single snapshot is approximately
8× the maximum travel time of sound waves across the array. This travel
time is longest at array endfire, where it is

$$
\Delta T_{\text{snap}} \approx 8 L / c
\tag{1.4.4}
$$

for sound wave propagation speed $c$. The approximate number of snapshots available,
$K_{\text{avail}}$, is then

$$
\begin{aligned}
K_{\text{avail}} &\approx \Delta T_{\text{stat}} / \Delta T_{\text{snap}} \\
&= \lambda c R / (8 v L^2).
\end{aligned}
\tag{1.4.5}
$$


The number of snapshots desired is determined by the dimension of the sample covariance
matrix; a common rule-of-thumb is to use more than two or three times the
dimension for a well-estimated matrix [9]. Using fewer snapshots produces weights
that are not robust and are sensitive to noise. Assuming an optimistic factor of two
times the data dimension, the number of snapshots desired, $K_{\text{des}}$, for a PSA and for
a VSA is

$$
\begin{aligned}
K_{\text{des,PSA}} &\approx 2 \cdot N \\
&= 4 L / \lambda
\end{aligned}
\tag{1.4.6}
$$

$$
\begin{aligned}
K_{\text{des,VSA}} &\approx 2 \cdot 4N \\
&= 16 L / \lambda.
\end{aligned}
\tag{1.4.7}
$$


A typical scenario with $R = 10$ km, $v = 20$ knots, and $f = 200$ Hz yields the curves
shown in Figure 1.4.3. These curves illustrate a fundamental adaptive processing
problem: the number of available snapshots is usually far fewer than desired. The
problem is worse for vector-sensor arrays because of the higher dimension. As indicated
on the plot, covariance matrices are poorly estimated for vector-sensor arrays
longer than about $11.5\lambda$, or $N > 23$. The same problem exists for pressure-sensor arrays,
but at longer lengths ($18.2\lambda$, or $N > 36$). Adaptive VSA processing techniques
must combat this snapshot deficiency to be effective in practice.
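The crossover lengths quoted above can be reproduced with a few lines of Python. This is a minimal sketch of Equations 1.4.5 through 1.4.7; the sound speed of 1500 m/s and the knot-to-m/s conversion are assumptions not stated in the text, and the function names are illustrative.

```python
import numpy as np

KNOT = 0.514444                  # metres per second per knot

def snapshots_available(length_wl, wavelength, sound_speed, source_range, source_speed):
    """Equation 1.4.5, with the array length given in wavelengths."""
    length = length_wl * wavelength
    return wavelength * sound_speed * source_range / (8.0 * source_speed * length**2)

def snapshots_desired(length_wl, measurements_per_sensor, factor=2.0):
    """Equations 1.4.6 and 1.4.7: factor * data dimension, with N = 2L/lambda."""
    return factor * measurements_per_sensor * 2.0 * length_wl

def crossover_length_wl(measurements_per_sensor, wavelength, sound_speed,
                        source_range, source_speed, factor=2.0):
    """Array length (in wavelengths) where available equals desired snapshots."""
    # lambda*c*R / (8 v L^2) = factor * m * 2L / lambda, solved for L / lambda.
    cube = sound_speed * source_range / (
        16.0 * factor * measurements_per_sensor * source_speed * wavelength)
    return cube ** (1.0 / 3.0)

sound_speed = 1500.0             # m/s, assumed nominal value for seawater
source_range = 10e3              # m
source_speed = 20.0 * KNOT       # m/s
wavelength = sound_speed / 200.0 # m, for f = 200 Hz

vsa_wl = crossover_length_wl(4, wavelength, sound_speed, source_range, source_speed)
psa_wl = crossover_length_wl(1, wavelength, sound_speed, source_range, source_speed)
print(f"VSA crossover: {vsa_wl:.1f} wavelengths (N = {2 * vsa_wl:.1f})")
print(f"PSA crossover: {psa_wl:.1f} wavelengths (N = {2 * psa_wl:.1f})")
```

Because the available snapshots fall as 1/L² while the desired snapshots grow linearly in L, the crossover length scales only as the cube root of the range-to-speed ratio, so longer arrays become snapshot-starved quickly.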

High dimensional vector-sensor data complicates adaptive processing a second way
by increasing the computational requirements. A typical adaptive processing operation,
the singular value decomposition of a covariance matrix, generally takes $O(N^3)$
floating point operations. A vector-sensor array, then, increases the computational
burden by a factor of roughly $4^3 = 64$.

Current  approaches  for  high-dimensional  adaptive  array  processing  fall  into  three 
categories. The first category augments the covariance matrix to make it well-conditioned.
Fixed and variable diagonal loading as covered in [10] and [11] take this
approach. The second category performs adaptive beamforming in reduced-dimension
linear subspaces. Many techniques fall in this category, including some suggested in
[4]. The third category utilizes additional information to improve the covariance matrix
estimate. One such technique is “Physically Constrained Maximum-Likelihood
(PCML)” estimation [12]. The problem of high dimension is more pronounced with
vector-sensor  arrays,  so  existing  and  new  techniques  must  be  closely  examined. 
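As a small illustration of the first category, the Python sketch below applies fixed diagonal loading to a sample covariance matrix estimated from fewer snapshots than the data dimension. The white Gaussian data, the dimensions, and the loading level (10% of the average diagonal power) are hypothetical choices for demonstration and are not drawn from [10] or [11].

```python
import numpy as np

rng = np.random.default_rng(0)
dimension = 40           # e.g. 4N for N = 10 vector-sensors
n_snapshots = 20         # fewer snapshots than the data dimension

# Hypothetical data: complex white Gaussian snapshots stored as columns.
snapshots = (rng.standard_normal((dimension, n_snapshots))
             + 1j * rng.standard_normal((dimension, n_snapshots))) / np.sqrt(2)

# Sample covariance: rank-deficient because n_snapshots < dimension.
sample_cov = snapshots @ snapshots.conj().T / n_snapshots

# Fixed diagonal loading at an assumed level relative to the average power.
loading = 0.1 * np.trace(sample_cov).real / dimension
loaded_cov = sample_cov + loading * np.eye(dimension)

rank = np.linalg.matrix_rank(sample_cov)
min_eig_loaded = np.linalg.eigvalsh(loaded_cov).min()
print(f"sample covariance rank: {rank} of {dimension}")
print(f"condition number after loading: {np.linalg.cond(loaded_cov):.1f}")
```

Loading raises every eigenvalue by the same amount, so the loaded matrix is invertible even though the sample covariance alone is singular.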

1.5  Key  Findings 

This  thesis  includes  several  novel  contributions  to  the  field  of  array  processing.  It 
establishes  the  limits  of  VSA  performance  and  describes  practical  techniques  that 
approach  these  limits.  Organized  by  chapter,  the  most  significant  contributions  are: 

Ch 2: A thorough exploration of vector-sensor array fundamentals. One
key finding in this area is that many useful properties are not exhibited by
vector-sensor arrays. Another key finding is a real expression for the VSA
beampattern.

Ch 3: Two performance bounds on a critical VSA capability: resolving
pressure ambiguities. These bounds relate ambiguity resolution to the common
problems of detection and estimation. Key findings include showing that
1) the bounds, although fundamentally different, both depend on the same simple
quantity, and 2) good performance is theoretically possible in most cases.

Ch 4: The design of robust, fixed weights with excellent performance characteristics.
Key findings include the “Minimum Sensitivity” criterion, an algorithm
for designing robust weights, and a demonstration of improved performance.

Ch 5: The derivation of optimum subspaces that enable or improve adaptive
processing. Key findings include 1) the optimality criterion of “inner
product preservation,” 2) the derivation of eigenbeam subspaces as least-squares
designs, and 3) a demonstration of significant performance improvement.

Several of the contributions listed above are summarized at the end of the thesis in
Section 6.1, Figure 6.1.1.


1.6  Sensor  and  Environment  Model 

This  entire  document  assumes  the  same  sensor  and  environment  model  to  simplify 
discussion.³ Each section explicitly notes any departures from or extensions to this
common  model.  The  subsequent  analysis  assumes  the  following  sensor  model: 

1.  Co-located  sensor  components.  The  hydrophone  and  three  geophones  of  each 
vector-sensor  are  located  at  the  same  point  and  observing  the  same  state.  In 
practice,  this  requires  the  component  spacing  to  be  small  compared  with  the 
minimum  wavelength. 

2.  Point  sensors.  Each  vector-sensor  is  modeled  as  a  single  point.  In  practice, 
this  requires  the  sensor  dimensions  to  be  small  compared  with  the  minimum 
wavelength. 

3. Geophones with cosine response. The signal response of each geophone is proportional
to the cosine of the angle between the geophone axis and the source.
Cosine geophone response results from measuring velocity along only one axis.

4. Orthogonal geophones. The axes of the three geophones are orthogonal. This is
true in practice when each vector-sensor is a static unit.

³The same sensor, environment, and plane wave models are also covered in [1].

The thesis also assumes the following environment model:

1.  Free-space  environment.  Sound  waves  travel  in  a  quiescent,  homogeneous, 
isotropic  fluid  wholespace.  This  implies  direct-path  propagation  only. 

2.  Narrowband  signals.  The  signal  is  analyzed  at  a  single  frequency.  This  means 
the  signal  is  sufficiently  band-limited  to  allow  narrowband  processing  in  the 
frequency domain. Passive sonar systems typically operate over a wide, many-octave
bandwidth; narrowband signals may be obtained by computing the discrete
Fourier transform of the measurements and processing each frequency bin.

3.  Plane  wave  propagation.  The  sound  waves  are  planar  at  each  sensor  and  across 
the  array.  This  implies  the  unit  vector  from  each  sensor  to  the  source  is  the 
same,  regardless  of  the  sensor  location.  Sound  waves  are  approximately  planar 
when  the  source  is  beyond  the  Fresnel  range  [8]. 

The  underlying  assumptions  and  notation  are  similar  to  those  in  [2,  6,  13]  although 
this  document  has  a  different  objective. 
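The narrowband assumption in the environment model can be illustrated with a short Python sketch that forms narrowband snapshots from wideband time series by block-wise discrete Fourier transforms. The sample rate, block length, test tone, and function name are hypothetical; windowing and block overlap, which a practical system would likely use, are omitted for brevity.

```python
import numpy as np

def narrowband_snapshots(time_series, block_length, bin_index):
    """Split each channel into blocks, take the DFT, and keep one frequency bin.

    time_series: real array of shape (channels, samples).
    Returns a complex array of shape (channels, n_blocks), one snapshot per block.
    """
    channels, samples = time_series.shape
    n_blocks = samples // block_length
    blocks = time_series[:, :n_blocks * block_length].reshape(
        channels, n_blocks, block_length)
    spectra = np.fft.rfft(blocks, axis=-1)
    return spectra[:, :, bin_index]

# Hypothetical example: two channels observing a tone with a phase offset.
sample_rate = 1024.0      # Hz (assumed)
block_length = 256
tone = 200.0              # Hz, falls exactly on bin 50 for these settings
t = np.arange(4 * block_length) / sample_rate
data = np.vstack([np.cos(2 * np.pi * tone * t),
                  np.cos(2 * np.pi * tone * t - np.pi / 2)])

bin_index = int(round(tone * block_length / sample_rate))
snaps = narrowband_snapshots(data, block_length, bin_index)
phase_offset = np.angle(snaps[1, 0] / snaps[0, 0])
print(f"snapshot array shape: {snaps.shape}, phase offset: {phase_offset:.4f} rad")
```

Each column of `snaps` is one complex snapshot at the selected frequency; the inter-channel phase is preserved, which is what narrowband array processing relies on.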

1.7  Plane  Wave  Measurement  Model 

Under the assumptions in Section 1.6, consider a plane wave parameterized by azimuth
$\phi \in [0, 2\pi)$ and elevation $\psi \in [-\pi/2, \pi/2]$ impinging on an array of N vector-sensors.
The remainder of the thesis assumes a right-handed coordinate system with
$\phi = 0$ as forward endfire, $\phi = \pi/2$ as port broadside, $\psi = 0$ as zero elevation, and
$\psi = \pi/2$ as upward. The parameters $\phi$ and $\psi$ are sometimes grouped into the vector
$\Theta$ for notational convenience. Without loss of generality, assume the geophone axes
are parallel to the axes of the coordinate system. If this is not the case, the data from
each vector-sensor can be rotated to match the coordinate axes. The unit vector,

$$
\mathbf{u} = [\cos\phi \cos\psi, \; \sin\phi \cos\psi, \; \sin\psi]^T,
\tag{1.7.1}
$$

points from the origin to the source (or, opposite the direction of the wave propagation).
The following derivations touch only briefly on direct-path acoustic propagation.
For a much more detailed study of ocean acoustics, see [14].


Under common conditions, the components of velocity relate linearly to pressure.
Assuming an inviscid homogeneous fluid, the Navier-Stokes equations become the
Euler equations

$$
\frac{\partial \mathbf{v}}{\partial t} + \mathbf{v}^T \nabla \mathbf{v} = -\frac{\nabla p}{\rho}
\tag{1.7.2}
$$

where $\mathbf{v}$ is fluid velocity, $\rho$ is density, and $p$ is pressure. For acoustic propagation
this equation is linearized, neglecting the convective acceleration term $\mathbf{v}^T \nabla \mathbf{v}$. With
a plane wave, the pressure $p$ relates across time $t$ and position $\mathbf{x}$ through the sound
speed $c$:

$$
p(\mathbf{x}, t) = f\!\left( \frac{\mathbf{u}^T \mathbf{x}}{c} + t \right)
\tag{1.7.3}
$$

$$
\therefore \; \nabla p = \frac{\mathbf{u}}{c} \frac{\partial p}{\partial t}.
\tag{1.7.4}
$$

Substituting Equation 1.7.4 into the Euler equations in 1.7.2 shows that under weak
initial conditions the pressure and fluid velocity obey the plane wave impedance
relation

$$
\mathbf{v} = -\frac{\mathbf{u}}{\rho c} \, p.
\tag{1.7.5}
$$

Because the geophones are aligned with the coordinate axes, they simply measure the
components of the velocity vector $\mathbf{v}$. The resulting linear relationship between the
pressure and each component of the fluid velocity greatly simplifies the analysis of
vector-sensor array performance.
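The plane wave relations above can be checked numerically. The following Python sketch uses central finite differences on a sinusoidal plane wave to verify the gradient relation (Equation 1.7.4) and that the impedance relation (Equation 1.7.5) satisfies the linearized Euler equation. The density, sound speed, tone frequency, arrival direction, and evaluation point are arbitrary assumed values.

```python
import numpy as np

rho, c = 1000.0, 1500.0            # assumed density (kg/m^3) and sound speed (m/s)
phi, psi = 0.7, 0.2                # arbitrary arrival direction (rad)
u = np.array([np.cos(phi) * np.cos(psi),
              np.sin(phi) * np.cos(psi),
              np.sin(psi)])
omega = 2 * np.pi * 200.0          # 200 Hz tone

def pressure(x, t):
    """Plane wave p(x, t) = f(u^T x / c + t) with f = sin(omega * .)."""
    return np.sin(omega * (u @ x / c + t))

def velocity(x, t):
    """Particle velocity from the impedance relation v = -u p / (rho c)."""
    return -u * pressure(x, t) / (rho * c)

x0, t0 = np.array([3.0, -2.0, 1.0]), 0.0123
h_x, h_t = 1e-4, 1e-8              # central-difference steps

grad_p = np.array([(pressure(x0 + h_x * e, t0) - pressure(x0 - h_x * e, t0)) / (2 * h_x)
                   for e in np.eye(3)])
dp_dt = (pressure(x0, t0 + h_t) - pressure(x0, t0 - h_t)) / (2 * h_t)
dv_dt = (velocity(x0, t0 + h_t) - velocity(x0, t0 - h_t)) / (2 * h_t)

# Equation 1.7.4: grad p = (u / c) dp/dt
err_gradient = np.max(np.abs(grad_p - u / c * dp_dt))
# Linearized Euler equation: rho dv/dt = -grad p
err_euler = np.max(np.abs(rho * dv_dt + grad_p))
print(f"gradient relation error: {err_gradient:.2e}, Euler residual: {err_euler:.2e}")
```

Both residuals are limited only by finite-difference error, which supports treating each velocity component as a scaled copy of the pressure waveform.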

Figure 1.7.1: Vector-sensor measurements, scaled to common units

The linear relationship in Equation 1.7.5 enables expressing the velocity measurements
in terms of pressure and the source unit vector. Returning to the array of N
vector-sensors, the plane wave measurement of the nth vector-sensor in phasor form
is

$$
e^{j k_0 (\mathbf{r}_n^T \mathbf{u})}
\begin{bmatrix} 1 \\ -\mathbf{u} / \rho c \end{bmatrix}
\tag{1.7.6}
$$

where $\mathbf{r}_n$ is the position of the sensor and

$$
k_0 \triangleq \frac{2\pi}{\lambda}
\tag{1.7.7}
$$

is the wavenumber for a wavelength $\lambda$. The term outside the vector is the wave phase
delay, which factors out because of Equation 1.7.5. Only the gain difference between
the pressure sensors and geophones is important. For convenience, this thesis chooses
a normalization that absorbs that gain difference into the pressure term:

$$
e^{j k_0 (\mathbf{r}_n^T \mathbf{u})}
\begin{bmatrix} \eta \\ \mathbf{u} \end{bmatrix}
\tag{1.7.8}
$$


Although this choice of normalization seems arbitrary, it results in simpler expressions
later and is similar to the notation used in [2, 6, 13].⁴ Also note that this choice of
normalization  requires  a  factor  of  (pc)“^  when  comparing  beam  estimates  in  units  of 
absolute  power. 

⁴The $\eta$ defined here is not exactly the same as the one used in [2, 6, 13].




The remainder of this thesis uses the gain factor $\eta = 1$ in all derivations unless
otherwise stated. In most cases, the results are easily extended to arbitrary $\eta$, sometimes
by inspection. The choice of $\eta = 1$ for analysis has an important practical
motivation involving the trade-off between left/right resolution and white noise array
gain or sensitivity. Although the pressure-sensor often has higher gain ($\eta > 1$) for
actual vector-sensor arrays, the array data is easily normalized to common units as
shown in Figure 1.7.1. Normalizing the units produces two results: 1) a slight loss
of array gain because of the increased geophone noise, and 2) an improved ability to
resolve ambiguities. Vector-sensor arrays are generally chosen for their ambiguity resolution,
so (2) takes precedence. Put another way, using the $\eta = 1$ data normalization
strengthens ambiguity resolution at the possible expense of white noise gain.
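To make the normalized measurement model concrete, the Python sketch below stacks the per-sensor measurements, a phase delay multiplying [η, u], into a single array vector. The sensor positions, wavelength, arrival direction, and function names are hypothetical values chosen for illustration.

```python
import numpy as np

def unit_vector(phi, psi):
    """Unit vector toward azimuth phi and elevation psi (Equation 1.7.1)."""
    return np.array([np.cos(phi) * np.cos(psi),
                     np.sin(phi) * np.cos(psi),
                     np.sin(psi)])

def array_measurement(positions, phi, psi, wavelength, eta=1.0):
    """Stack exp(j k0 r_n^T u) * [eta, u] over all sensors into a 4N vector.

    positions: array of shape (N, 3) holding each sensor position r_n.
    """
    u = unit_vector(phi, psi)
    k0 = 2.0 * np.pi / wavelength                 # wavenumber
    phase = np.exp(1j * k0 * (positions @ u))     # one phase delay per sensor
    single = np.concatenate(([eta], u))           # pressure gain, then geophones
    return np.outer(phase, single).reshape(-1)    # ordered sensor by sensor

wavelength = 7.5                                  # metres, hypothetical
n_sensors = 4
positions = np.zeros((n_sensors, 3))
positions[:, 0] = (wavelength / 2.0) * np.arange(n_sensors)   # along the x-axis

v = array_measurement(positions, phi=np.pi / 3, psi=0.1, wavelength=wavelength)
norm_sq = np.vdot(v, v).real
print(f"vector length: {v.size}, squared norm: {norm_sq:.3f}")
```

With η = 1 the pressure and velocity channels contribute equally, so the squared norm of the stacked vector is 2N regardless of arrival direction.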
Chapter 2


Elements of Vector-Sensor Array
Processing


Any field, no matter how well organized, possesses a set of fundamentals that must
be understood before moving into advanced topics. The field of acoustic vector-sensor
array processing is built from elements common to all array processing and
elements specific to vector-sensor arrays. The source [5] reviews the former; this
chapter introduces some of the latter. Most of the concepts introduced in this chapter
are new to the literature and, although simple, have profound consequences.


2.1 Vector-Sensor Array Beampattern

One of the most fundamental differences between vector-sensor arrays and pressure-sensor
arrays is the structure of the beampattern. Building on the plane wave measurement
model provided in Section 1.7, this section provides an expression for the
beampattern of an arbitrary vector-sensor array with arbitrary element weighting. It
then simplifies this expression for the case of a uniform linear vector-sensor array. The
symmetry of the uniform linear array leads to the use of conjugate symmetric weights,
a real beampattern, and a reflection symmetry relating beams in one quadrant to the
other three quadrants.
2.1.1 General Complex Beampattern Expression


The most general case for a vector-sensor beampattern is an arbitrary array with an
arbitrary, complex element weighting. Beginning with the measurement of a single
vector-sensor in Equation 1.7.8, the beampattern of an $N$-element vector-sensor array
is the weighted sum

$$y(\Theta) = \sum_{n=1}^{N} \mathbf{w}_n^H \mathbf{v}_n(\Theta), \tag{2.1.1}$$

where

$$\mathbf{v}_n(\Theta) = e^{j k_0 \mathbf{r}_n^T \mathbf{u}(\Theta)} \begin{bmatrix} 1 \\ \mathbf{u}(\Theta) \end{bmatrix} \tag{2.1.2}$$

is the measurement of the $n$th vector-sensor and $\mathbf{w}_n$ are the weights. Recall from
Section 1.7 that $\mathbf{r}_n$ is the position of the $n$th vector-sensor, $k_0$ is the wavenumber,
and $\mathbf{u}(\Theta)$ is the unit vector directed toward $\Theta$. Without knowledge of any sensor
positions or constraints on the weights, Equation 2.1.1 cannot be simplified further.
It is generally a complex valued expression that is difficult to analyze partly because
the unit vector, $\mathbf{u}$, appears both inside and outside the complex exponential. The
beampattern at any point is a linear combination of the weights, so defining the array
measurement and weight vectors,

$$\mathbf{v}(\Theta) \triangleq \begin{bmatrix} \mathbf{v}_1^T(\Theta) & \mathbf{v}_2^T(\Theta) & \cdots & \mathbf{v}_N^T(\Theta) \end{bmatrix}^T \tag{2.1.3}$$

$$\mathbf{w} \triangleq \begin{bmatrix} \mathbf{w}_1^T & \mathbf{w}_2^T & \cdots & \mathbf{w}_N^T \end{bmatrix}^T, \tag{2.1.4}$$

enables writing Equation 2.1.1 as a compact inner product:

$$y(\Theta) = \mathbf{w}^H \mathbf{v}(\Theta). \tag{2.1.5}$$

Sampling the beampattern at a set of $M$ points, $\{\Theta_1, \Theta_2, \ldots, \Theta_M\}$, corresponds to
the linear transformation

$$\mathbf{y}^* = \mathbf{V}^H \mathbf{w}, \tag{2.1.6}$$
with

$$\mathbf{V} \triangleq \begin{bmatrix} \mathbf{v}(\Theta_1) & \mathbf{v}(\Theta_2) & \cdots & \mathbf{v}(\Theta_M) \end{bmatrix} \tag{2.1.7}$$

$$\mathbf{y} \triangleq \begin{bmatrix} y(\Theta_1) & y(\Theta_2) & \cdots & y(\Theta_M) \end{bmatrix}^T. \tag{2.1.8}$$

This linear transformation is valid for any arbitrary vector-sensor array; its real
counterpart is derived in the next section and forms the foundation of beampattern design
via convex optimization in Chapter 4.
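As a rough numerical illustration of the inner-product form above, the following sketch
builds the manifold vector of a 2-D array lying along the x-axis, evaluates the
beampattern as $\mathbf{w}^H \mathbf{v}$, and checks that sampling it agrees with the
product $\mathbf{V}^H \mathbf{w}$. The function names, element positions, wavenumber, and
conventional weights are hypothetical choices made only for this example; the formulas
are the ones stated in this section with $\eta = 1$.

```python
import cmath
import math

def sensor_response(d_n, k0, phi):
    """One 2-D vector-sensor at axial position d_n: phase term times [1, cos, sin]."""
    phase = cmath.exp(1j * k0 * d_n * math.cos(phi))
    return [phase, phase * math.cos(phi), phase * math.sin(phi)]

def manifold(positions, k0, phi):
    """Stack the per-sensor responses into the array vector v(phi)."""
    v = []
    for d_n in positions:
        v.extend(sensor_response(d_n, k0, phi))
    return v

def beampattern(w, positions, k0, phi):
    """y(phi) = w^H v(phi)."""
    v = manifold(positions, k0, phi)
    return sum(wi.conjugate() * vi for wi, vi in zip(w, v))

def sampled_conjugate(w, positions, k0, angles):
    """y^* = V^H w: entry m is v(phi_m)^H w."""
    out = []
    for phi in angles:
        v = manifold(positions, k0, phi)
        out.append(sum(vi.conjugate() * wi for vi, wi in zip(v, w)))
    return out

# Hypothetical example: 4 elements at half-wavelength spacing (k0 * d = pi),
# conventional weights steered to phi0 = pi / 3.
k0 = math.pi
positions = [-1.5, -0.5, 0.5, 1.5]
phi0 = math.pi / 3
v0 = manifold(positions, k0, phi0)
w = [x / (2 * len(positions)) for x in v0]  # v0^H v0 = 2N, so w^H v0 = 1

angles = [0.1 * m for m in range(1, 6)]
y = [beampattern(w, positions, k0, phi) for phi in angles]
y_conj = sampled_conjugate(w, positions, k0, angles)
```

Because each sensor contributes $1 + \cos^2\phi + \sin^2\phi = 2$ to the squared norm of
the manifold vector, dividing the steered vector by $2N$ gives unit response in the
steering direction.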


2.1.2  Simplifications  Exploiting  Symmetry 


For a linear vector-sensor array with elements symmetric about the origin, a series of
simplifications to Equation 2.1.1 is possible. These simplifications allow 1) conjugate
symmetry that reduces the number of variables from 8N to 3N and 2) quadrant
symmetry that reduces the design burden by a factor of four. This thesis discusses
signals in the x-y plane, but the results extend easily to the 3-D case. When dealing
with signals in the horizontal plane, the vertical geophone contributes nothing and is
ignored. Because the array is linear and the position and direction vectors are in the
horizontal plane,

$$k_0 \mathbf{r}_n^T \mathbf{u}(\Theta) = d_n k_0 \cos\phi, \tag{2.1.9}$$

where $d_n$ is the position of the element along the array. Ignoring the vertical geophone,
the measurement vector of a single vector-sensor is

$$\mathbf{v}_n(\phi) = e^{j d_n k_0 \cos\phi} \begin{bmatrix} 1 \\ \cos\phi \\ \sin\phi \end{bmatrix}. \tag{2.1.10}$$

Writing each weight vector in terms of magnitude and phase gives

$$\mathbf{w}_n = \begin{bmatrix} a_n e^{j\alpha_n} & b_n e^{j\beta_n} & c_n e^{j\gamma_n} \end{bmatrix}^T, \tag{2.1.11}$$

where $a_n$, $\alpha_n$, etc. are real. Substituting Equations 2.1.10 and 2.1.11 into Equation
2.1.1 yields

$$y(\phi) = \sum_{n=1}^{N} a_n e^{j(d_n k_0 \cos\phi - \alpha_n)} + b_n e^{j(d_n k_0 \cos\phi - \beta_n)} \cos\phi + c_n e^{j(d_n k_0 \cos\phi - \gamma_n)} \sin\phi. \tag{2.1.12}$$


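Because the element spacing is symmetric about the array center, 1) the vectors $\mathbf{v}(\phi)$
are conjugate symmetric and 2) most problems involve only conjugate symmetric
weights (see Appendix A.2). The full-length (conjugate symmetric) weight vector, $\mathbf{w}$,
is fully characterized by a half-length weight vector, $\tilde{\mathbf{w}}$. Assuming an even number of
elements, $N = 2L$, the parameterization is

$$\tilde{d}_l \triangleq d_{L+l} \tag{2.1.13}$$

$$\tilde{\mathbf{w}}_l \triangleq \begin{bmatrix} a_l e^{j\alpha_l} & b_l e^{j\beta_l} & c_l e^{j\gamma_l} \end{bmatrix}^T \tag{2.1.14}$$

$$\mathbf{w}_n = \begin{cases} \tilde{\mathbf{w}}_{n-L} & n > L \\ \tilde{\mathbf{w}}^*_{L-n+1} & n \le L \end{cases} \tag{2.1.15, 2.1.16}$$

for real variables $a_l$, $\alpha_l$, etc. The beampattern in Equation 2.1.12 becomes a real
function when the weights are conjugate symmetric:

$$\begin{aligned} y(\phi) = \sum_{l=1}^{L} \; & 2a_l \cos(\tilde{d}_l k_0 \cos\phi - \alpha_l) \\ & + 2b_l \cos(\tilde{d}_l k_0 \cos\phi - \beta_l) \cos\phi \\ & + 2c_l \cos(\tilde{d}_l k_0 \cos\phi - \gamma_l) \sin\phi. \end{aligned} \tag{2.1.17}$$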


Note the similarity between the derivation above and the steps involved in FIR filter
design; this aspect of the vector-sensor beampattern is explored more in Section 2.3.
Using trigonometric identities, Equation 2.1.17 simplifies further:

$$\begin{aligned} y(\phi) &= \sum_{l=1}^{L} 2 \Big[ a_l \cos(\tilde{d}_l k_0 \cos\phi) \cos\alpha_l + a_l \sin(\tilde{d}_l k_0 \cos\phi) \sin\alpha_l \\ &\qquad\quad + b_l \cos(\tilde{d}_l k_0 \cos\phi) \cos\beta_l \cos\phi + b_l \sin(\tilde{d}_l k_0 \cos\phi) \sin\beta_l \cos\phi \\ &\qquad\quad + c_l \cos(\tilde{d}_l k_0 \cos\phi) \cos\gamma_l \sin\phi + c_l \sin(\tilde{d}_l k_0 \cos\phi) \sin\gamma_l \sin\phi \Big] \\ &= \sum_{l=1}^{L} 2 \cos(\tilde{d}_l k_0 \cos\phi) \left[ a_l \cos\alpha_l + b_l \cos\beta_l \cos\phi + c_l \cos\gamma_l \sin\phi \right] \\ &\qquad + 2 \sin(\tilde{d}_l k_0 \cos\phi) \left[ a_l \sin\alpha_l + b_l \sin\beta_l \cos\phi + c_l \sin\gamma_l \sin\phi \right] \\ &= \sum_{l=1}^{L} 2 \cos(\tilde{d}_l k_0 \cos\phi) \left[ a_l^R + b_l^R \cos\phi + c_l^R \sin\phi \right] \\ &\qquad + 2 \sin(\tilde{d}_l k_0 \cos\phi) \left[ a_l^I + b_l^I \cos\phi + c_l^I \sin\phi \right]. \end{aligned} \tag{2.1.18}$$

The last step changes from a magnitude/phase parameterization to a real/imaginary
parameterization, with the superscripts $R$ and $I$ indicating the real and imaginary parts
of the weights (for example, $a_l^R = a_l \cos\alpha_l$ and $a_l^I = a_l \sin\alpha_l$). Four aspects
of Equation 2.1.18 are worth noting. First, the conjugate
symmetry reduces the number of (real) variables from 6N to 3N. Second, for a linear
vector-sensor array with elements spaced uniformly, $d$ units apart, $\tilde{d}_l = (l - \tfrac{1}{2})d$. Third,
the mapping from the reduced variables in Equation 2.1.18 to the full, conjugate
symmetric weight is a linear transformation. Fourth, the derivation of Equation
2.1.18 assumes even $N$ but is easily modified for odd $N$.
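The claim that conjugate symmetric weights produce a real beampattern can be checked
numerically. The sketch below is a minimal example under assumed values: the half-length
weights, the spacing `d`, and the wavenumber `k0` are arbitrary illustrative numbers, and
the helper names are hypothetical. It evaluates the complex weighted sum over the full
array and compares it with the real/imaginary form of the half-array sum.

```python
import cmath
import math

def full_weights(half):
    """Build conjugate-symmetric full weights from half-length weights.

    half[l] holds the complex (a, b, c) weights of the l-th element on the
    positive side; its mirror element on the negative side gets the conjugate.
    """
    L = len(half)
    negative_side = [tuple(x.conjugate() for x in half[L - 1 - n]) for n in range(L)]
    return negative_side + list(half)

def y_complex(weights, positions, k0, phi):
    """Direct evaluation of sum_n w_n^H v_n(phi) for a linear 2-D array."""
    total = 0j
    for (a, b, c), d_n in zip(weights, positions):
        phase = cmath.exp(1j * k0 * d_n * math.cos(phi))
        total += (a.conjugate() + b.conjugate() * math.cos(phi)
                  + c.conjugate() * math.sin(phi)) * phase
    return total

def y_real(half, half_positions, k0, phi):
    """Real form: cosine terms use real parts, sine terms use imaginary parts."""
    total = 0.0
    for (a, b, c), d_l in zip(half, half_positions):
        arg = k0 * d_l * math.cos(phi)
        total += 2 * math.cos(arg) * (a.real + b.real * math.cos(phi) + c.real * math.sin(phi))
        total += 2 * math.sin(arg) * (a.imag + b.imag * math.cos(phi) + c.imag * math.sin(phi))
    return total

# Arbitrary illustrative values (assumptions, not taken from the text).
k0, d = 2.0, 0.5
half = [(0.3 + 0.1j, 0.2 - 0.4j, -0.1 + 0.2j),
        (0.1 - 0.2j, 0.05 + 0.3j, 0.2 + 0.0j),
        (-0.2 + 0.1j, 0.1 + 0.1j, 0.3 - 0.2j)]
half_positions = [(l + 0.5) * d for l in range(len(half))]   # (l - 1/2) d for l = 1..L
positions = [-p for p in reversed(half_positions)] + half_positions
test_angles = (0.0, 0.4, 1.3, 2.9, -2.0)
```

For every test angle the imaginary part of the complex sum vanishes to rounding error,
and its real part matches the half-array expression.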

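The beampattern in Equation 2.1.18 is a simple inner product, the real counterpart
to Equation 2.1.5. The single-sensor weight and measurement terms

$$\bar{\mathbf{v}}_l(\phi) = 2 \begin{bmatrix} \cos(\tilde{d}_l k_0 \cos\phi) \\ \sin(\tilde{d}_l k_0 \cos\phi) \end{bmatrix} \otimes \begin{bmatrix} 1 \\ \cos\phi \\ \sin\phi \end{bmatrix} \tag{2.1.19}$$

$$\bar{\mathbf{w}}_l \triangleq \begin{bmatrix} a_l^R & b_l^R & c_l^R & a_l^I & b_l^I & c_l^I \end{bmatrix}^T \tag{2.1.20}$$

are the real counterparts to Equations 2.1.10 and 2.1.11. Concatenating these terms
yields the full array vectors

$$\bar{\mathbf{v}}(\phi) \triangleq \begin{bmatrix} \bar{\mathbf{v}}_1^T(\phi) & \bar{\mathbf{v}}_2^T(\phi) & \cdots & \bar{\mathbf{v}}_L^T(\phi) \end{bmatrix}^T \tag{2.1.21}$$

$$\bar{\mathbf{w}} \triangleq \begin{bmatrix} \bar{\mathbf{w}}_1^T & \bar{\mathbf{w}}_2^T & \cdots & \bar{\mathbf{w}}_L^T \end{bmatrix}^T, \tag{2.1.22}$$

which are the real counterparts to Equations 2.1.3 and 2.1.4. Writing Equation 2.1.18
as a real inner product gives

$$y(\phi) = \bar{\mathbf{w}}^T \bar{\mathbf{v}}(\phi) \tag{2.1.23}$$

$$\mathbf{y} = \bar{\mathbf{V}}^T \bar{\mathbf{w}}, \tag{2.1.24}$$

the real counterparts to Equations 2.1.5 and 2.1.6, respectively.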

Although the beampattern in Equation 2.1.18 cannot be simplified further without
restrictive assumptions, there is another way to exploit the symmetry of the array.
The even/odd symmetry of the cosine/sine functions allows any beampattern to be
“mirrored” easily to any of the four quadrants. A given beampattern is mirrored
across the array axis by changing

$$c_l^R \mapsto -c_l^R, \qquad c_l^I \mapsto -c_l^I. \tag{2.1.25}$$

This transformation negates the cross-axial component and yields the same response
as the original weight on the opposite side of the array. A similar transformation
allows cross-axial mirroring, or mirroring from forward to aft:

$$a_l^I \mapsto -a_l^I, \qquad b_l^R \mapsto -b_l^R, \qquad c_l^I \mapsto -c_l^I. \tag{2.1.26}$$

In this case, the sign changes are a combination of conjugation and negation of the
axial component. Performing both axial and cross-axial mirroring, one beam is mirrored
to any of the four quadrants. Figure 2.1.1 provides an example beampattern
that is mirrored from one quadrant to the other three. The beampattern shown is
for a uniform linear vector-sensor array with $N = 10$, $\phi_0 = -\pi/4$, and $f = 5f_d/7$.
In addition to being linear transformations, the mirroring operations only involve
sign changes. Mirroring allows efficient conventional beamforming because a single
quadrant of partial sums from each sensor type forms a full set of beams spanning all
quadrants with only sign changes. Mirroring also reduces the effort required to design
a set of beams by a factor of four.

[Figure 2.1.1: A VSA beampattern is easily “mirrored” to any quadrant. Horizontal axis: angle (radians), from $-\pi$ to $\pi$.]
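The mirroring operations can be sketched directly on the real weight parameterization.
In the example below each half-array element carries the six real weights
$(a^R, b^R, c^R, a^I, b^I, c^I)$; the weight values, positions, and wavenumber are
arbitrary assumptions, and the function names are hypothetical. Mirroring across the
array axis negates the cross-axial terms, while fore/aft mirroring combines conjugation
with negation of the axial terms.

```python
import math

def y_real(w_bar, half_positions, k0, phi):
    """Real-form beampattern of a symmetric linear 2-D vector-sensor array."""
    total = 0.0
    for (aR, bR, cR, aI, bI, cI), d_l in zip(w_bar, half_positions):
        arg = k0 * d_l * math.cos(phi)
        total += 2 * math.cos(arg) * (aR + bR * math.cos(phi) + cR * math.sin(phi))
        total += 2 * math.sin(arg) * (aI + bI * math.cos(phi) + cI * math.sin(phi))
    return total

def mirror_across_axis(w_bar):
    """Negate the cross-axial weights: response at phi becomes the old response at -phi."""
    return [(aR, bR, -cR, aI, bI, -cI) for (aR, bR, cR, aI, bI, cI) in w_bar]

def mirror_fore_aft(w_bar):
    """Conjugate and negate the axial weights: response at phi becomes the old one at pi - phi."""
    return [(aR, -bR, cR, -aI, bI, -cI) for (aR, bR, cR, aI, bI, cI) in w_bar]

# Arbitrary illustrative values (assumptions for this sketch only).
k0 = 2.0
half_positions = [0.25, 0.75, 1.25]
w_bar = [(0.3, 0.2, -0.1, 0.1, -0.4, 0.2),
         (0.1, 0.05, 0.2, -0.2, 0.3, 0.0),
         (-0.2, 0.1, 0.3, 0.1, 0.1, -0.2)]
test_angles = (0.2, 0.9, 1.7, 2.6)
```

Applying both functions in sequence maps a beam to the diagonally opposite quadrant, so
one designed beam yields four using sign changes only.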

2.2  Robustness  and  the  Sensitivity  Factor 

The sensor and propagation models used in array processing often contain appreciable
errors, or “mismatch.” A significant source of mismatch is imperfection in the array
itself: the exact gains, phases, positions, and orientations of the sensors are unknown.
The “sensitivity factor” of a weight vector quantifies its robustness to these modeling
errors. The (normalized) sensitivity factor of a VSA weight vector, $\mathbf{w}$, is

$$\xi \triangleq \mathbf{v}_0^H \mathbf{v}_0 \, \mathbf{w}^H \mathbf{w} = 2N \, \mathbf{w}^H \mathbf{w}. \tag{2.2.1}$$

For weights subject to a unity gain constraint, the Cauchy-Schwarz inequality implies
$\xi \ge 1$. The sensitivity factor fully characterizes the deviation of a pressure-sensor
array beampattern under Gaussian errors (see [5, §2.6.3]). The relationship is more
complex for vector-sensor arrays ([15]), but $\xi$ remains an effective surrogate measure
for the sensitivity of a weight vector to mismatch. Robustness generally decreases as $\xi$
increases, so constraining $\xi$ to be small provides robustness in adaptive beamforming
[10]. Section 4.3.3 applies a similar technique to fixed weight design.
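A small numerical example may help fix the scale of the sensitivity factor. The sketch
below, with hypothetical function names and an assumed six-element half-wavelength
array, compares two unity-gain weight vectors: conventional weights using all three
components of every sensor, and weights that use only the pressure components. The first
attains the Cauchy-Schwarz lower bound of 1; the second has a sensitivity factor of 2.

```python
import cmath
import math

def manifold(positions, k0, phi):
    """Array vector for a linear 2-D vector-sensor array with eta = 1."""
    v = []
    for d_n in positions:
        phase = cmath.exp(1j * k0 * d_n * math.cos(phi))
        v.extend([phase, phase * math.cos(phi), phase * math.sin(phi)])
    return v

def gain(w, v):
    """Response w^H v of the weights to the array vector v."""
    return sum(wi.conjugate() * vi for wi, vi in zip(w, v))

def sensitivity_factor(w, n_sensors):
    """Normalized sensitivity factor: 2N times the squared weight norm."""
    return 2 * n_sensors * sum(abs(x) ** 2 for x in w)

# Assumed example geometry: N = 6 elements, k0 * d = pi, look direction 70 degrees.
N, k0 = 6, math.pi
positions = [n - (N - 1) / 2 for n in range(N)]
v0 = manifold(positions, k0, math.radians(70))

# Conventional weights: matched to the full array vector, scaled for unity gain.
w_conventional = [x / (2 * N) for x in v0]

# Pressure-only weights: keep every third entry (the omnidirectional channel).
w_pressure = [x / N if i % 3 == 0 else 0j for i, x in enumerate(v0)]

xi_conventional = sensitivity_factor(w_conventional, N)
xi_pressure = sensitivity_factor(w_pressure, N)
```

Both weight vectors satisfy the unity gain constraint, so the difference in the
sensitivity factor comes entirely from how the weight energy is distributed.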

2.3  Properties  of  Linear  Vector-Sensor  Arrays 

Because  a  pressure-sensor  array  is  a  subset  of  any  vector-sensor  array,  it  seems  possible 
that  many  useful  properties  of  linear  pressure-sensor  arrays  extend  to  vector-sensor 
arrays.  However,  the  additional  complexity  of  vector-sensors  makes  it  necessary  to 
re-examine  these  properties  because  many  require  modification  or  do  not  apply. 

2.3.1  Local  Fourier  Transform  Property 

One useful property of linear pressure-sensor arrays is that the beampattern is simply
the discrete Fourier transform of the weights. This relationship enables using Fourier
transform properties and the tools of FIR filtering in design and analysis. The first
entry in Equation 2.1.10, corresponding to the pressure-sensor, is the complex exponential
of a Fourier transform vector. For the other components, however, the Fourier
transform relationship does not hold because of the $\sin\phi$ and $\cos\phi$ terms outside the
exponential.

Although the exact Fourier transform relationship is not valid for vector-sensor
arrays, a useful “local” Fourier transform property exists. A small region around any
nominal point in a linear VSA beampattern behaves like a Fourier transform. The
gain of the directional sensors is approximately constant around the nominal angle.
Treating the directional sensors as constant introduces some error, $\Delta\bar{\mathbf{v}}(\phi)$, into the
array manifold vectors. The error in the resulting beampattern is

$$|\Delta y(\phi)| = \left| \bar{\mathbf{w}}^T [\Delta\bar{\mathbf{v}}(\phi)] \right| \le |\bar{\mathbf{w}}| \, |\Delta\bar{\mathbf{v}}(\phi)|. \tag{2.3.1}$$

The bound in Equation 2.3.1 arises from the Cauchy-Schwarz inequality and is not
necessarily tight. If the sensitivity factor is bounded, $\xi \le \alpha^2$, the magnitude of the
weight is bounded, $|\bar{\mathbf{w}}| \le \alpha/\sqrt{2N}$. The error in the manifold vector comes only from
errors in the directional terms in Equation 2.1.18:

$$\begin{aligned} |\Delta\bar{\mathbf{v}}(\phi)|^2 &= 4 \sum_{l=1}^{L} \Big[ \cos^2(\tilde{d}_l k_0 \cos\phi)(\Delta\cos\phi)^2 + \cos^2(\tilde{d}_l k_0 \cos\phi)(\Delta\sin\phi)^2 \\ &\qquad\quad + \sin^2(\tilde{d}_l k_0 \cos\phi)(\Delta\cos\phi)^2 + \sin^2(\tilde{d}_l k_0 \cos\phi)(\Delta\sin\phi)^2 \Big] \\ &= 2N \cdot \left[ (\Delta\cos\phi)^2 + (\Delta\sin\phi)^2 \right]. \end{aligned} \tag{2.3.2}$$

Substituting the bound on $|\bar{\mathbf{w}}|$ and the expression for $|\Delta\bar{\mathbf{v}}(\phi)|$ into Equation 2.3.1
gives

$$|\Delta y(\phi)| \le \alpha \cdot \left[ (\Delta\cos\phi)^2 + (\Delta\sin\phi)^2 \right]^{1/2} \tag{2.3.3}$$

$$\phantom{|\Delta y(\phi)|} = 2\alpha \sin\frac{\Delta\phi}{2} \le \alpha \cdot (\Delta\phi). \tag{2.3.4}$$

The last inequality is tight near the nominal angle. Equations 2.3.3 and 2.3.4 are
useful for two reasons. First, they prove that in a small region ($\Delta\phi$ much less than
a beamwidth) around any point, the beampattern approximately equals a weighted
Fourier transform. Because the pressure-sensor beampattern is a Fourier transform,
the vector-sensor beampattern, around a given angle, behaves like a pressure-sensor
beampattern. Second, because the deviation of the pressure-sensor beampattern between
two sample points is bounded, Equations 2.3.3 and 2.3.4 prove that the deviation
of the vector-sensor beampattern between two sample points is also bounded.
In Chapter 4, this bounded deviation allows a vector-sensor beampattern to be
approximated by a finite set of points with negligible error.
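The local Fourier transform property can be probed numerically by freezing the
directional gains at a nominal angle while letting the phase terms vary. The sketch
below is an illustration under assumptions: it works with the complex form of the array
vector rather than the real form, takes $\alpha$ equal to the square root of the
sensitivity factor of the chosen weights, and uses an arbitrary ten-element array with
conventional weights. All names are hypothetical.

```python
import cmath
import math

def manifold(positions, k0, phi_phase, phi_gain):
    """Array vector with phases evaluated at phi_phase and directional gains at phi_gain."""
    v = []
    for d_n in positions:
        phase = cmath.exp(1j * k0 * d_n * math.cos(phi_phase))
        v.extend([phase, phase * math.cos(phi_gain), phase * math.sin(phi_gain)])
    return v

def inner(w, v):
    """Inner product w^H v."""
    return sum(wi.conjugate() * vi for wi, vi in zip(w, v))

def local_error(w, positions, k0, phi0, dphi):
    """Return (error, chord bound, linear bound) for a deviation dphi from phi0."""
    phi = phi0 + dphi
    exact = inner(w, manifold(positions, k0, phi, phi))
    frozen = inner(w, manifold(positions, k0, phi, phi0))  # gains held at phi0
    alpha = math.sqrt(2 * len(positions) * sum(abs(x) ** 2 for x in w))
    chord_bound = 2 * alpha * math.sin(abs(dphi) / 2)
    return abs(exact - frozen), chord_bound, alpha * abs(dphi)

# Assumed example: N = 10, half-wavelength spacing, conventional weights at 60 degrees.
N, k0 = 10, math.pi
positions = [n - (N - 1) / 2 for n in range(N)]
phi0 = math.radians(60)
w = [x / (2 * N) for x in manifold(positions, k0, phi0, phi0)]
results = [local_error(w, positions, k0, phi0, dphi) for dphi in (0.01, 0.05, 0.2, -0.1)]
```

In each case the observed error stays below the chord bound, which in turn stays below
the linear bound in the angular deviation.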


2.3.2  No  Modulation  or  “Steering”  Property 

The  Fourier  transform  relationship  between  the  weights  and  the  beampattern  for  a 
linear  pressure-sensor  array  has  many  useful  implications.  One  such  implication  is 
that  modulating  the  phase  of  the  weights  “steers”  an  arbitrary  beampattern  (viewed 
in  cosine-space)  to  any  angle.  The  steered  beampattern  and  the  original  beampattern 
have  the  same  shape  in  cosine-space.  In  practice,  the  steering  property  means  that 
only  one  real  weight  -  a  taper  -  designed  at  array  broadside  is  sufficient  to  form 
identical  beams  anywhere. 

As useful as this property is for linear pressure-sensor arrays, it does not apply to
linear vector-sensor arrays. Like the Fourier transform relationship, the modulation
property takes a modified, weakened form with vector-sensor arrays. Separating the
vector-sensor measurements into phase and gain components reveals that 1) the phase
component exhibits a modulation property in cosine-space like a linear pressure-sensor
array, and 2) the gain component is rotated in angle-space by Euler rotations.
Although  each  rotation  is  a  linear  transformation  of  the  weight  (or  equivalently,  the 
data),  the  gain  and  phase  components  cannot  be  separated  by  a  linear  system.  Thus, 
no  linear  transformation  steers  a  vector-sensor  array  beampattern  to  an  arbitrary 
direction.  Although  the  lack  of  a  steering  property  means  each  beam  must  be  designed 
separately,  the  “mirroring”  techniques  illustrated  in  Figure  2.1.1  provide  a  useful  way 
to  reduce  the  design  burden. 
2.3.3 Non-Polynomial Beampattern


Another useful uniform, linear, PSA result associated with the Fourier transform
property is that the beampattern is a polynomial function of the variable
$z = \cos(d k_0 \cos\phi)$. This well-known result is easily seen from Equation 2.1.18 by
removing the directional elements, assuming real weights, and applying a Chebyshev
polynomial relation to each nested cosine term. The polynomial form of the uniform,
linear, PSA beampattern forms the foundation of many tools including Chebyshev,
Taylor, and Villeneuve tapers and the Parks-McClellan algorithm [5, §3.4 and 3.6].
Such tools apply polynomial factorizations or approximations to the beampattern in
$z$-space.

Unfortunately, the vector-sensor array beampattern does not have a similar polynomial
form. For a polynomial beampattern representation to be useful, it must be an
unconstrained polynomial in some real function $z(\phi)$. The following discussion
suppresses the dependence of $z$ on azimuth angle, $\phi$, when convenient. The single-sensor
case illustrates why such a representation does not exist for vector-sensor arrays. Two
beampatterns possible with a single vector-sensor are those given by the axial and
cross-axial directional sensors:

$$y_0(\phi) = \cos\phi \tag{2.3.5}$$

$$y_1(\phi) = \sin\phi. \tag{2.3.6}$$

Assume that an unconstrained polynomial representation does exist for some function
$z$. Because a 2-D vector-sensor beampattern involves three weights, both beampatterns
must correspond to unconstrained, quadratic polynomials in $z$, that is,

$$\cos\phi = a_0 z^2 + b_0 z + c_0 \tag{2.3.7}$$

$$\sin\phi = a_1 z^2 + b_1 z + c_1 \tag{2.3.8}$$

for some real coefficients $a_0$, $a_1$, $b_0$, etc. Breaking the first equation into the $a_0 \ne 0$
and $a_0 = 0$ (quadratic and linear) cases and solving, Equation 2.3.7 constrains the
function $z$ to lie in one of the two sets of functions

$$\mathcal{Q}_0 = \left\{ z : z = \frac{-b_0 + s(\phi)\sqrt{b_0^2 - 4a_0(c_0 - \cos\phi)}}{2a_0}, \;\; a_0 \ne 0, \;\; b_0^2 - 4a_0 c_0 \ge |4a_0| \right\} \tag{2.3.9}$$

$$\mathcal{L}_0 = \left\{ z : z = \frac{\cos\phi - c_0}{b_0}, \;\; a_0 = 0 \right\}. \tag{2.3.10}$$


The sign function $s(\phi)$ takes only values of $\pm 1$. To reconstruct both beampatterns,
any function $z$ is further constrained by Equation 2.3.8. The functions in $\mathcal{L}_0$ are even
and cannot construct the odd sine function via the composition in Equation 2.3.8.
For any function in $\mathcal{Q}_0$ to convey sign information about $\phi$, the sign function $s(\phi)$
must be odd. The form of $z$ is thus restricted to a constant (even) part plus the odd
part involving $s(\phi)$. The even part of the function must become identically zero
when substituted into Equation 2.3.8, leaving the requirement that

$$\sin\phi \propto s(\phi)\sqrt{1 + a\cos\phi} \tag{2.3.11}$$

for some real coefficient $a$. For this to be true and continuous at the origin, $a = -1$
and

$$|\sin\phi| \propto \sqrt{1 - \cos\phi}. \tag{2.3.12}$$

Because this is clearly not true, no function in $\mathcal{Q}_0$ satisfies Equation 2.3.8, i.e. no
function satisfies both Equations 2.3.7 and 2.3.8. Thus, no unconstrained polynomial
form exists for the single-sensor beampatterns in Equations 2.3.5 and 2.3.6. Furthermore,
because these beampatterns are possible with any vector-sensor array, VSA
beampatterns generally do not have an unconstrained polynomial representation.


Although having a non-polynomial beampattern nullifies the polynomial-based
techniques listed above, it does not mean equivalent results are impossible with acoustic
vector-sensor arrays. As Chapter 4 shows, equivalent beampatterns are achievable
with techniques not based on polynomial functions.

2.3.4 Weights With Nonlinear Element-Wise Phase


Weights  exhibiting  linear  element-wise  phase  are  another  useful  implication  of  the 
Fourier  transform  property  of  linear  pressure-sensor  arrays.  The  modulation  property 
makes  real  weights  designed  at  array  broadside  sufficient  for  use  at  any  cosine  angle. 
Real  weights,  when  modulated  to  another  cosine  angle,  become  complex  weights  with 
a  linear  element-wise  phase  progression. 

Because every replica vector on the linear vector-sensor array manifold exhibits
a linear element-wise phase progression, it seems that the weights should necessarily
exhibit this property as well. To the contrary, Appendix A.2 suggests that vector-sensor
weights need not have linear element-wise phase. Chapter 4 proves the existence
of such weights by example: many of the custom-designed weights have nonlinear
element-wise phase progressions. Although weights with nonlinear element-wise
phase stray from the concept of “spatial tapering,” they often perform well with
vector-sensor arrays. Depending on the design problem, forcing VSA weights to have
linear element-wise phase may sacrifice significant performance.


2.3.5  Nonlinear  Physical  Constraints 

A final property of vector-sensor arrays that deserves clarification is the nonlinearity
of the physical constraints. The four measurements of a single vector-sensor are
somewhat redundant. The omnidirectional sensor measures the pressure field; the
directional sensors measure the gradient of the pressure field.

Although it seems that this redundancy should be easy to exploit, its nonlinear
nature leads to complications. Even in the simplest case of the single plane-wave
source, the measurements are related quadratically by power. For a single plane-wave
source, the sum of the power measured by the directional sensors equals the
total power measured by the omnidirectional sensor. With multiple sources, the
relationship becomes even more complex and nonlinear. Full exploitation of such
physical constraints requires nonlinear techniques such as [12].
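The quadratic power relation is easy to see with a noise-free snapshot of a single 2-D
vector-sensor. The following sketch assumes unit-gain sensors ($\eta = 1$), sources in
the horizontal plane, and a sensor at the origin so that no phase terms appear; the
function names and source values are hypothetical. It shows that the relation holds for
one plane wave and fails for the coherent sum of two.

```python
import math

def snapshot(amplitudes, angles):
    """Noise-free [pressure, axial, cross-axial] snapshot of one sensor at the origin."""
    p = sum(amplitudes)
    vx = sum(s * math.cos(a) for s, a in zip(amplitudes, angles))
    vy = sum(s * math.sin(a) for s, a in zip(amplitudes, angles))
    return p, vx, vy

def power_gap(p, vx, vy):
    """Directional power minus omnidirectional power; zero for a single plane wave."""
    return abs(vx) ** 2 + abs(vy) ** 2 - abs(p) ** 2

# One plane wave with an arbitrary complex amplitude and arrival angle.
gap_single = power_gap(*snapshot([0.7 - 0.2j], [1.1]))

# Two coherent plane waves: the cross term scales with cos(angle difference),
# so the simple power equality no longer holds.
gap_double = power_gap(*snapshot([1.0, 1.0], [0.3, 1.5]))
```

For the two-source case the gap equals $2\cos(1.2) - 2$, which is why any constraint
built on this relation depends on the unknown source geometry.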
2.4 Spatially Spread Sources with Linear VSAs


It is common scientific practice to simplify problems by discretizing distributions:
point masses in physics, impulses and pure sinusoids in signal processing, and point
sources in acoustics. Although these approximations are often extremely accurate,
they are sometimes misleading. Modeling spatially distributed sounds as point sources
occasionally predicts performance that is unachievable in practice. To avoid such
a pitfall, this section derives integrals and approximations for 2-D spatially spread
sources as observed by a linear vector-sensor array. The 2-D vector-sensor array is
modeled for simplicity, but extending these results to 3-D is discussed where applicable.
Assuming all spatial processes are zero-mean Gaussian, the quantity of interest
is most generally the covariance between two sensors. Because a 2-D vector-sensor
measures three quantities, this covariance is size $3 \times 3$ for a single pair and $3N \times 3N$
for an array of $N$ sensors.

Figure 2.4.1 provides a notional comparison of point and spread sources. The
point source corresponds to the impulsive spatial distribution denoted by the gray
arrow in the top plot. The response of a vector-sensor array to this point source gives
the familiar beampattern shown in the bottom plot. The spatially spread source, by
contrast, corresponds to a uniform spatial distribution in cosine-space on the starboard
side of the array. The array response to both distributions exhibits sidelobe
structure because of the finite aperture and “backlobe” structure because of the pressure
ambiguity. The spread source integrates power over a range of angles, “filling-in”
nulls and widening the array response. The spatial spreading in Figure 2.4.1 is exaggerated
to illustrate its effects on the array response; spatial distributions are often
more concentrated than the figure suggests.

2.4.1  General  Integral  Form 

[Figure 2.4.1: Notional response of a VSA to point and spread sources]

Although spatially spread sources are unexplored with linear vector-sensor arrays,
they are common with linear pressure-sensor arrays. Because a pressure-sensor array
is a subset of a vector-sensor array, this thesis carefully chooses a spatial spreading
model consistent with the decades of vetted pressure-sensor work. This section
extends the model presented in [5, §8.9], analyzing an azimuthal distribution of
uncorrelated, zero-mean, Gaussian sources. The distribution is specified in terms of
azimuthal cosine (rather than angle), keeping with convention, encouraging closed-form
expressions, and restricting the distribution to one side of the array. The results
are easily extended to two-sided distributions by expressing any two-sided distribution
as the sum of two one-sided distributions. Because the integrated sources are
uncorrelated, the covariance between two sensors is given by the single integral

$$r_{01} = \int_{-1}^{+1} p(u) \, v_0(u) \, v_1^*(u) \, du \tag{2.4.1}$$

where $u = \cos\phi$ is the azimuthal cosine, $p(u)$ is the spatial distribution of power, and
$v_i(u)$ are the responses of each sensor to a signal at $u$. When the two sensors are part
of a linear vector-sensor array, each response contains a gain term depending only on
direction and a phase term depending on both position and direction. If the sensor
position along the array axis is $x$ and the gain of each sensor is $g_i(u)$, the response is

$$v_i(u, x) \triangleq g_i(u) \, e^{j k_0 x u}. \tag{2.4.2}$$

The gain terms for the geophone elements are simply the azimuthal sine and cosine
expressed in terms of $u$. Using subscripts $o$, $x$, $y$ for the omnidirectional, inline, and
cross-axial sensors, these gain terms are

$$g_o(u) = 1 \tag{2.4.3}$$

$$g_x(u) = u \tag{2.4.4}$$

$$g_y(u) = \pm\sqrt{1 - u^2}. \tag{2.4.5}$$

The sign of $g_y(u)$ changes depending on the side of the array. The remainder of this
section assumes $g_y(u)$ is positive, corresponding to a distribution on the port side of
the array. Substituting Equation 2.4.2 into Equation 2.4.1 gives

$$r(x_0, x_1) = \int_{-1}^{+1} p(u) \, g_0(u) \, g_1^*(u) \, e^{j k_0 x_0 u} e^{-j k_0 x_1 u} \, du. \tag{2.4.6}$$

Equation 2.4.6 is easily written in terms of the distance between the sensors, $\delta \triangleq x_0 - x_1$,
and the composite gain function of the sensor pair, $G_{01}(u) \triangleq g_0(u) g_1^*(u)$:

$$r(\delta) = \int_{-1}^{+1} p(u) \, G_{01}(u) \, e^{j k_0 \delta u} \, du. \tag{2.4.7}$$

Extending the covariance function, $r(\delta)$, to 3-D vector-sensor arrays requires no
additional work as the elevation terms fall outside the integral. The integral in Equation
2.4.7 is the windowed Fourier transform of $p(u)G_{01}(u)$, so a closed form seems possible.
Unfortunately, the number and variety of gain functions make obtaining closed
forms for all integrals very difficult with a given spatial distribution. The exact
integral form in Equation 2.4.7 does, however, admit several useful and insightful
approximations.
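When no closed form is at hand, the covariance integral over cosine-space can simply be
evaluated numerically. The sketch below uses a midpoint rule with an assumed uniform
distribution centered at `u0 = 0.3` with standard deviation `0.05`; these values, the
step count, and all function names are illustrative assumptions. The gain functions are
those of the omnidirectional, inline, and cross-axial sensors on the port side.

```python
import cmath
import math

GAINS = {
    "o": lambda u: 1.0,                               # omnidirectional
    "x": lambda u: u,                                 # inline geophone
    "y": lambda u: math.sqrt(max(0.0, 1.0 - u * u)),  # cross-axial, port side
}

def uniform_density(u, u0, sigma_u):
    """Uniform distribution in cosine-space with mean u0 and standard deviation sigma_u."""
    half_width = sigma_u * math.sqrt(3.0)
    return 1.0 / (2.0 * half_width) if abs(u - u0) <= half_width else 0.0

def covariance(delta, k0, sensor0, sensor1, density, steps=20000):
    """Midpoint-rule evaluation of the covariance integral over u in [-1, 1]."""
    g0, g1 = GAINS[sensor0], GAINS[sensor1]
    h = 2.0 / steps
    total = 0j
    for i in range(steps):
        u = -1.0 + (i + 0.5) * h
        total += density(u) * g0(u) * g1(u) * cmath.exp(1j * k0 * delta * u)
    return total * h

def density(u):
    return uniform_density(u, 0.3, 0.05)

k0 = math.pi
r_oo = covariance(0.0, k0, "o", "o", density)
r_xx = covariance(0.0, k0, "x", "x", density)
r_yy = covariance(0.0, k0, "y", "y", density)
r_xy = covariance(1.0, k0, "x", "y", density)
r_yx = covariance(-1.0, k0, "y", "x", density)
```

At zero separation the omnidirectional term recovers the total power of the
distribution, and the two directional terms sum to the same value because
$u^2 + (1 - u^2) = 1$ at every point of the integrand.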


2.4.2  Constant  Gain  Approximation 

The simplest and most useful approximation to Equation 2.4.7 arises from the smooth
nature of the gain functions and the small width of typical spatial spreading. The
standard deviation of the distribution is usually small (less than 5% of cosine-space)
when modeling spatially spread sources. Over such a small range of $u$, the gain
functions are well-approximated as constant. This constant gain approximation yields a
simple but powerful model for vector-sensor spatial spreading using covariance matrix
tapers.

When the sensor gains are approximated as constant, incorporating spatial spreading
simply modulates the existing covariance function. Without loss of generality,
assume the spatial distribution has mean $u_0$ and is a shifted version of the zero-mean
distribution $p_0(u)$. Applying the constant gain approximation to Equation 2.4.7 at
$u_0$ allows the gain terms to be taken outside the integral:

$$r(\delta) \approx g_0(u_0) \, g_1^*(u_0) \int_{-1}^{+1} p_0(u - u_0) \, e^{j k_0 \delta u} \, du. \tag{2.4.8}$$


Equation 2.4.8 is simplified in two steps. The first step is extending the range of
the distribution to include the entire real line. The extended region $u \notin [-1, 1]$
is referred to as “virtual” space because it does not correspond to real azimuthal
angles. It does, however, provide a natural extension for spatially spread sources at
array endfire, where the distribution extends into virtual space. The second step is
utilizing a Fourier transform property to simplify the integral. Translation in the
$u$ domain corresponds to a phase shift in the $\delta$ domain. Applying both steps to
Equation 2.4.8 gives

$$\begin{aligned} r(\delta) &\approx g_0(u_0) \, g_1^*(u_0) \int_{-\infty}^{+\infty} p_0(u - u_0) \, e^{j k_0 \delta u} \, du \\ &= g_0(u_0) \, g_1^*(u_0) \, e^{j k_0 \delta u_0} \cdot P_0(k_0 \delta), \end{aligned} \tag{2.4.9}$$

where

$$P_0(k_0 \delta) \triangleq \int_{-\infty}^{+\infty} p_0(u) \, e^{j k_0 \delta u} \, du \tag{2.4.10}$$

is the Fourier transform of the distribution $p_0(u)$. Equation 2.4.9 is divided into two
terms. The first term is the original covariance function without spatial spreading.
The effects of spatial spreading appear as a modulation by the second term, $P_0(k_0 \delta)$.
This modulating term, or “spread function,” is independent of the source location
given by the mean, $u_0$. Because it is a Fourier integral, the spread function often
has a closed form. Two common and tractable choices for $p_0(u)$ are the uniform and
Gaussian distributions. These distributions and their associated spread functions are
summarized in the table

$$\begin{array}{lll} \text{Distribution} & p_0(u) & P_0(k_0 \delta) \\ \hline \text{Uniform} & \begin{cases} 1/(2\sigma_u\sqrt{3}) & |u| \le \sigma_u\sqrt{3} \\ 0 & \text{otherwise} \end{cases} & \mathrm{sinc}(k_0 \delta \sigma_u \sqrt{3}) \\ \text{Gaussian} & \dfrac{1}{\sigma_u\sqrt{2\pi}} \exp\{-u^2/(2\sigma_u^2)\} & \exp\{-(k_0 \delta \sigma_u)^2/2\} \end{array}$$

where $\mathrm{sinc}(\cdot)$ is the unnormalized sinc function and $\sigma_u$ is the standard deviation of
the distribution in cosine-space. Equation 2.4.9 reveals that 1) the effects of spatial
spreading are well-approximated by modulation and 2) the modulating spread
function does not depend on source location.
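The closed-form spread functions for the uniform and Gaussian distributions can be
checked against a direct numerical Fourier integral. The sketch below is a minimal
example with hypothetical function names and assumed values of $\sigma_u$ and
$k_0\delta$; it uses the unnormalized sinc, $\sin(x)/x$, and a midpoint rule for the
integral.

```python
import cmath
import math

def sinc(x):
    """Unnormalized sinc: sin(x) / x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def spread_uniform(k0_delta, sigma_u):
    """Spread function of a zero-mean uniform distribution with std sigma_u."""
    return sinc(k0_delta * sigma_u * math.sqrt(3.0))

def spread_gaussian(k0_delta, sigma_u):
    """Spread function of a zero-mean Gaussian distribution with std sigma_u."""
    return math.exp(-0.5 * (k0_delta * sigma_u) ** 2)

def fourier_integral(density, k0_delta, lo, hi, steps=20000):
    """Midpoint-rule evaluation of the integral of density(u) * exp(j * k0_delta * u)."""
    h = (hi - lo) / steps
    total = 0j
    for i in range(steps):
        u = lo + (i + 0.5) * h
        total += density(u) * cmath.exp(1j * k0_delta * u)
    return total * h

# Assumed illustrative values: 5% standard deviation, a few sensor separations.
sigma_u = 0.05
half_width = sigma_u * math.sqrt(3.0)

def uniform(u):
    return 1.0 / (2.0 * half_width)   # evaluated only inside [-half_width, half_width]

def gaussian(u):
    return math.exp(-u * u / (2 * sigma_u ** 2)) / (sigma_u * math.sqrt(2 * math.pi))

checks = []
for k0_delta in (0.0, 3.0, 10.0, 25.0):
    num_u = fourier_integral(uniform, k0_delta, -half_width, half_width)
    num_g = fourier_integral(gaussian, k0_delta, -10 * sigma_u, 10 * sigma_u)
    checks.append((num_u, spread_uniform(k0_delta, sigma_u),
                   num_g, spread_gaussian(k0_delta, sigma_u)))
```

Both spread functions equal one at zero separation and decay as the separation grows,
which is the behavior that fills in nulls of the point-source response.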

For an array of sensors, the constant gain approximation enables modeling spatial spreading with covariance matrix tapers. Each entry of the covariance matrix (for a single source) is Equation 2.4.9 evaluated at the correct inter-element distance. Separating the terms of Equation 2.4.9 reveals that modulating the point source covariance matrix, $\mathbf{R}$, approximates the spatially spread covariance matrix, $\mathbf{R}_s$:

$$\mathbf{R}_s \approx \mathbf{R} \odot \mathbf{P}. \tag{2.4.11}$$

The modulation matrix, $\mathbf{P}$, is given by the spread function and does not depend on the contents of $\mathbf{R}$. By linearity, any covariance matrix that is the sum of a (possibly infinite) number of point sources is approximately “spread” by applying the same modulation matrix. The matrix $\mathbf{P}$ is often referred to as a “covariance matrix taper” because of its similarity to temporal or spatial tapering. The three components of each vector-sensor are co-located, so the covariance matrix taper for a vector-sensor array is simply an extension of the taper for an array of omnidirectional sensors:

$$\mathbf{P}_{\text{2D-VSA}} = \mathbf{P}_{\text{PSA}} \otimes \mathbf{1}_{3 \times 3}. \tag{2.4.12}$$

For a 3-D VSA, the $3 \times 3$ matrix of ones is replaced by a $4 \times 4$ matrix of ones.

Figure 2.4.2 reveals how accurately the constant gain approximation models uniform spatial spreading. The figure illustrates response patterns for an $N = 10$ element, 2-D vector-sensor array at its design frequency. As shown in the figure, both the constant gain approximation and the exact integral expand the point source and “fill-in” the nulls in the response pattern. For a typical case with reasonably small spreading away from array endfire, the approximation is almost indistinguishable from the exact integral. In a more extreme case with large spreading near array endfire, the errors introduced by the approximation are minor but visible. The approximation is less accurate at endfire where the sensor gains may change rapidly with $u$. If the “extreme” case were moved any closer to endfire, spreading would extend into virtual space; the approximation would be useful but the exact integral would be undefined.
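
The covariance matrix taper of Equations 2.4.11 and 2.4.12 can be sketched in a few lines. This is an illustrative example under stated assumptions, not the implementation used for the figures: it assumes a uniform linear array with half-wavelength spacing (`k0_spacing = pi`), uniform spreading, a 2-D replica with per-sensor gains $[1, \cos\phi, \sin\phi]$, and an element ordering in which the three components of each sensor are adjacent. All function names are hypothetical.

```python
import numpy as np

def uniform_taper_psa(n_sensors, k0_spacing, sigma_u):
    """Pressure-sensor taper: the uniform spread function at every inter-element lag."""
    idx = np.arange(n_sensors)
    k0_delta = k0_spacing * (idx[:, None] - idx[None, :])
    # np.sinc is normalized, so divide by pi to obtain the unnormalized sinc.
    return np.sinc(k0_delta * sigma_u * np.sqrt(3.0) / np.pi)

def vsa_taper(taper_psa, n_components=3):
    """Equation 2.4.12: Kronecker product with a matrix of ones (3 for 2-D, 4 for 3-D)."""
    return np.kron(taper_psa, np.ones((n_components, n_components)))

def apply_taper(r_point, taper):
    """Equation 2.4.11: element-wise modulation of the point-source covariance."""
    return r_point * taper

n_sensors = 10
k0_spacing = np.pi          # assumed half-wavelength spacing, k0 * d = pi
sigma_u = 0.05              # hypothetical spread in cosine-space
phi = np.pi / 3             # hypothetical arrival angle

# Assumed 2-D vector-sensor replica: pressure phase times [1, cos(phi), sin(phi)].
phase = np.exp(1j * k0_spacing * np.arange(n_sensors) * np.cos(phi))
replica = np.kron(phase, np.array([1.0, np.cos(phi), np.sin(phi)]))
r_point = np.outer(replica, replica.conj())

taper = vsa_taper(uniform_taper_psa(n_sensors, k0_spacing, sigma_u))
r_spread = apply_taper(r_point, taper)
```

Because the taper has a unit diagonal, the spread covariance keeps the per-element powers of the point source while attenuating correlation at larger lags.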

2.4.3  Second-Order  Gain  Approximation 

The previous section applies a zeroth-order Taylor series approximation to the sensor gains, i.e. the gains are approximated as constant. This section explores higher-order approximations and develops a second-order approximation for uniform spatial spreading. Higher-order approximations become increasingly accurate at the expense of analytical simplicity. Any approximation greater than zero-order loses the simplicity and power of the covariance matrix taper interpretation.

A closed form expression is first derived for any polynomial gain function. Consider the $n$th-order gain function

$$G_n(u) \triangleq (u - u_0)^n, \tag{2.4.13}$$

which is a simple monomial in $\tilde{u} \triangleq u - u_0$. Extending the integral into virtual space transforms the covariance function (Equation 2.4.7) into

$$r_n(\delta) = \int_{-\infty}^{+\infty} G_n(u)\, p(u)\, e^{j k_0 \delta u}\, du = \int_{-\infty}^{+\infty} (u - u_0)^n\, p_0(u - u_0)\, e^{j k_0 \delta u}\, du. \tag{2.4.14}$$

Figure 2.4.2: Constant gain approximation to uniform spatial spreading

Equation 2.4.14 is an inverse Fourier transform, so 1) translation in one domain corresponds to a phase shift in the other and 2) modulation by a monomial in one domain corresponds to differentiation in the other. Applied in sequence to Equation 2.4.14, these properties yield the closed form

$$r_n(\delta) = \int_{-\infty}^{+\infty} (u - u_0)^n\, p_0(u - u_0)\, e^{j k_0 \delta u}\, du = e^{j k_0 \delta u_0}\, j^{-n}\, P_0^{(n)}(k_0 \delta), \tag{2.4.15}$$

where $P_0^{(n)}$ is the $n$th derivative of $P_0$ with respect to its argument. Any gain function that is an $n$th-order polynomial is expressible as a linear combination of $G_0(u), G_1(u), \ldots, G_n(u)$. It therefore has a closed form covariance function as a linear combination of $r_0(\delta), r_1(\delta), \ldots, r_n(\delta)$.

For acoustic vector-sensors, a second-order Taylor series is a convenient approximation to the gain functions because most terms are exact. Six composite gain functions must be integrated to fill each $3 \times 3$ block in the covariance matrix:

$$G_{oo}(u) = 1 \tag{2.4.16}$$
$$G_{ox}(u) = u \tag{2.4.17}$$
$$G_{xx}(u) = u^2 \tag{2.4.18}$$
$$G_{yy}(u) = 1 - u^2 \tag{2.4.19}$$
$$G_{oy}(u) = \sqrt{1 - u^2} \tag{2.4.20}$$
$$G_{xy}(u) = u\sqrt{1 - u^2}. \tag{2.4.21}$$

A second-order Taylor series is exact for the first four terms. Over a small region of $u$, the last two terms are well-approximated by second-order expansions about $u_0$,

$$G_{oy}(u) \approx \sqrt{1 - u_0^2} - \frac{u_0}{\sqrt{1 - u_0^2}}\,(u - u_0) - \frac{1}{2\,(1 - u_0^2)^{3/2}}\,(u - u_0)^2 \tag{2.4.22}$$

$$G_{xy}(u) \approx u_0\sqrt{1 - u_0^2} + \frac{1 - 2u_0^2}{\sqrt{1 - u_0^2}}\,(u - u_0) + \frac{u_0\,(2u_0^2 - 3)}{2\,(1 - u_0^2)^{3/2}}\,(u - u_0)^2. \tag{2.4.23}$$

Obtaining closed form covariance functions for this second-order approximation requires the derivatives $P_0^{(n)}(z)$ for $n = 1, 2$. These derivatives are easily computed for many distributions, but only uniform spreading is included here. For uniform spreading of width $\Delta = \sigma_u\sqrt{3}$, Section 2.4.2 gives

$$P_0(z) = j_0(z\Delta). \tag{2.4.24}$$


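The zeroth-order spherical Bessel function, $j_0(\cdot)$, is equivalent to the unnormalized sinc function. Derivatives of the spherical Bessel functions are related through the following recursion [16, 10.1.20]:

$$\frac{d}{dz}\, j_n(z) = \frac{n\, j_{n-1}(z) - (n+1)\, j_{n+1}(z)}{2n + 1}. \tag{2.4.25}$$

This recursion enables writing each $P_0^{(n)}(z)$ as a linear combination of the first $n + 1$ spherical Bessel functions. Applying Equation 2.4.25 gives the first two derivatives,

$$j^{-1} P_0^{(1)}(z) = j\Delta\, j_1(z\Delta) \tag{2.4.26}$$

$$j^{-2} P_0^{(2)}(z) = j^{-1}\frac{d}{dz}\left[\, j\Delta\, j_1(z\Delta) \right] = \frac{\Delta^2}{3}\left[\, j_0(z\Delta) - 2\, j_2(z\Delta) \right]. \tag{2.4.27}$$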


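Substituting these derivatives into Equation 2.4.15 yields

$$r_0(\delta) = e^{j k_0 \delta u_0} \cdot j_0(k_0 \delta \Delta) \tag{2.4.28}$$

$$r_1(\delta) = e^{j k_0 \delta u_0} \cdot j\Delta\, j_1(k_0 \delta \Delta) \tag{2.4.29}$$

$$r_2(\delta) = e^{j k_0 \delta u_0} \cdot \left[ \frac{\Delta^2}{3}\, j_0(k_0 \delta \Delta) - \frac{2\Delta^2}{3}\, j_2(k_0 \delta \Delta) \right]. \tag{2.4.30}$$

The gain functions in Equations 2.4.16-2.4.19, 2.4.22, and 2.4.23 are second-order polynomials, so their covariance functions require only the first three $r_n(\delta)$:

$$r_{oo}(\delta) = r_0(\delta) \tag{2.4.31}$$

$$r_{ox}(\delta) = u_0\, r_0(\delta) + r_1(\delta) \tag{2.4.32}$$

$$r_{xx}(\delta) = u_0^2\, r_0(\delta) + 2u_0\, r_1(\delta) + r_2(\delta) \tag{2.4.33}$$

$$r_{yy}(\delta) = (1 - u_0^2)\, r_0(\delta) - 2u_0\, r_1(\delta) - r_2(\delta) \tag{2.4.34}$$

$$r_{oy}(\delta) \approx \sqrt{1 - u_0^2}\; r_0(\delta) - \frac{u_0}{\sqrt{1 - u_0^2}}\; r_1(\delta) - \frac{1}{2\,(1 - u_0^2)^{3/2}}\; r_2(\delta) \tag{2.4.35}$$

$$r_{xy}(\delta) \approx u_0\sqrt{1 - u_0^2}\; r_0(\delta) + \frac{1 - 2u_0^2}{\sqrt{1 - u_0^2}}\; r_1(\delta) + \frac{u_0\,(2u_0^2 - 3)}{2\,(1 - u_0^2)^{3/2}}\; r_2(\delta). \tag{2.4.36}$$

Although it is not done here, substituting $r_n(\delta)$ into these equations yields closed forms in terms of the first three spherical Bessel functions.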
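
The closed forms for $r_0$, $r_1$, and $r_2$ can be compared with direct numerical integration of Equation 2.4.14 for uniform spreading. The sketch below is illustrative only: it assumes the $e^{+j k_0 \delta u}$ convention of Equation 2.4.14, a uniform density of half-width $\Delta$ centered at $u_0$, and hypothetical values for `z` (standing in for $k_0\delta$), `u0`, and `width`. The spherical Bessel functions are written out explicitly for orders 0 to 2 and require a nonzero argument.

```python
import numpy as np

def sph_bessel(order, x):
    """Spherical Bessel functions j_0, j_1, j_2 for nonzero real x."""
    x = np.asarray(x, dtype=float)
    if order == 0:
        return np.sin(x) / x
    if order == 1:
        return np.sin(x) / x**2 - np.cos(x) / x
    if order == 2:
        return (3.0 / x**2 - 1.0) * np.sin(x) / x - 3.0 * np.cos(x) / x**2
    raise ValueError("only orders 0, 1, 2 are implemented")

def r_closed_form(n, z, u0, width):
    """Closed-form r_n for uniform spreading, with z = k0 * delta (nonzero)."""
    phase = np.exp(1j * z * u0)
    x = z * width
    if n == 0:
        return phase * sph_bessel(0, x)
    if n == 1:
        return phase * 1j * width * sph_bessel(1, x)
    if n == 2:
        return phase * (width**2 / 3.0) * (sph_bessel(0, x) - 2.0 * sph_bessel(2, x))
    raise ValueError("only n = 0, 1, 2 are implemented")

def r_numeric(gain, z, u0, width, n_points=200001):
    """Trapezoidal evaluation of the covariance integral for an arbitrary gain(u)."""
    u = np.linspace(u0 - width, u0 + width, n_points)
    f = gain(u) * np.exp(1j * z * u) / (2.0 * width)
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(u))

def r_oy_second_order(z, u0, width):
    """Second-order approximation built from the Taylor coefficients of sqrt(1 - u^2)."""
    s = np.sqrt(1.0 - u0**2)
    coeffs = (s, -u0 / s, -0.5 / s**3)
    return sum(c * r_closed_form(n, z, u0, width) for n, c in enumerate(coeffs))

z, u0, width = 12.0, 0.3, 0.1      # hypothetical k0*delta, mean, and spreading width
monomial_errors = [
    abs(r_closed_form(n, z, u0, width)
        - r_numeric(lambda u, n=n: (u - u0) ** n, z, u0, width))
    for n in range(3)
]
oy_error = abs(r_oy_second_order(z, u0, width)
               - r_numeric(lambda u: np.sqrt(1.0 - u**2), z, u0, width))
```

The monomial errors test the closed forms themselves, while `oy_error` measures the truncation error of the second-order Taylor series, which grows as the spread region approaches endfire.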

The covariance functions obtained with the second-order approximation are more accurate but more complex than those obtained with the constant gain approximation. The complexity of the second-order approximation is evident in the lengthy covariance functions and the coupling of spreading width, $\Delta$, and mean, $u_0$. This coupling makes a covariance matrix taper interpretation impossible and impedes analysis. The accuracy of the second-order approximation is shown in Figure 2.4.3 for the same cases as Figure 2.4.2. Comparing the two figures reveals that the second-order approximation introduces little error, even with large spreading at endfire where the Taylor series is less accurate (see Section 2.4.2). For the purposes of this document, the constant gain approximation is chosen hereafter for its simplicity, intuition, and reasonable accuracy.


[Figure 2.4.3: two panels of array response power (dB) versus angle (radians); curves: point source, approximation, exact]

Figure 2.4.3: Second-order approximation to uniform spatial spreading


Chapter 3


Performance Limits of Vector-Sensor Arrays


Section 1.4.1 demonstrates that the bounds commonly employed in pressure-sensor array processing do not necessarily reflect the improvements offered by vector-sensors. To understand this discrepancy and motivate alternative bounds, the following paragraphs formalize the process by which existing bounds arose.

The first step in establishing a bound is identifying the relevant performance dimension. This performance dimension is often revealed by the engineering problem itself. Pressure-sensor arrays were developed to localize sound, so a natural performance dimension is angular resolution. Viewed another way, arrays of sensors are employed to amplify sound from a specific direction, so another popular performance dimension is array gain.

The second step toward a useful bound is choosing a scenario or model that accurately represents the given performance dimension. Like a good scientific experiment, the scenario should isolate the performance dimension as a dependent variable and clearly define the independent variables. Useful scenarios often fall into two categories: classification problems and estimation problems. For localization, the standard scenario is estimating the direction-of-arrival of signals in noise; for gain, it is detecting the presence of signals in noise.

The third and final step is deciding the type of bound and applying it to the chosen scenario. This decision depends on the goals of the researcher, the complexity of the problem, the required strength or “tightness” of the bound, and the scenario itself. Each bound has an analytical complexity associated with the derivation and a computational complexity associated with its numerical evaluation.

The three-step process outlined above also helps establish useful performance bounds for vector-sensor array processing. For the first step, Section 1.3 illustrates that the relevant performance dimension for vector-sensor arrays is resolution of pressure ambiguities. Sections 3.1 and 3.2 proceed through the second and third steps to establish both a classification-based bound and an estimation-based bound. The objective of this chapter is not an exhaustive treatise on vector-sensor performance bounds; the objective is motivating the study of non-traditional problems that better illustrate vector-sensor capabilities.

The key contribution of this chapter is a new theoretical foundation for vector-sensor performance. The two performance bounds developed in this chapter are distinct, but the conclusion is the same: linear vector-sensor arrays are theoretically able to resolve left/right ambiguities very well from almost any direction, at any frequency, and with any number of elements. The performance also seems to be robust, not relying on point nulls or “supergain.”


3.1  Novel  Classification-Based  Bound 

One useful scenario that highlights the vector-sensor array’s ability to resolve pressure ambiguities is illustrated in Figure 3.1.1. In this scenario, a narrowband sound source is positioned at one of two possible locations relative to a vector-sensor array. Both locations are chosen to yield identical pressure measurements, e.g. the left and right sides of a uniform linear array. Because these locations are ambiguous to a pressure-sensor array, this model isolates the vector-sensor performance. Under this setup, the array measures $K$ independent observations of the source corrupted by additive noise and/or interference.

[Figure 3.1.1: a linear VSA with one candidate source location on each side. Hypothesis $H_0$ (right side): $\mathbf{R}_0 = \sigma^2 \mathbf{v}(\phi)\mathbf{v}^H(\phi) + \mathbf{R}_n$. Hypothesis $H_1$ (left side): $\mathbf{R}_1 = \sigma^2 \mathbf{v}(-\phi)\mathbf{v}^H(-\phi) + \mathbf{R}_n$. The data are classified as either $H_0$ or $H_1$.]

Figure 3.1.1: Example right/left (binary) classification scenario

Each observation, $\mathbf{x}$, is given by

$$\mathbf{x} = \mathbf{x}_s + \mathbf{x}_n \tag{3.1.1}$$

where $\mathbf{x}_s$ and $\mathbf{x}_n$ are the signal and noise measurements, respectively. To simplify expressions, the observation column vectors are horizontally concatenated into the $4N \times K$ matrix $\mathbf{X}$, which is also used to form the sample covariance matrix $\hat{\mathbf{R}}$:

$$\mathbf{X} = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_K \end{bmatrix} \tag{3.1.2}$$

$$\hat{\mathbf{R}} = \frac{1}{K}\, \mathbf{X}\mathbf{X}^H. \tag{3.1.3}$$

The source is complex zero-mean Gaussian with known power $\sigma^2$; the noise is complex zero-mean Gaussian with known covariance $E\{\mathbf{x}_n \mathbf{x}_n^H\} = \mathbf{R}_n$. The two hypotheses, $H_0$ and $H_1$, have respective prior probabilities $\pi_0$ and $\pi_1$. The source replica vector is either $\mathbf{v}_0$ or $\mathbf{v}_1$, respectively.

The above scenario is a binary hypothesis test because there are only two classes. For binary hypotheses, the log-likelihood ratio test

$$\ln \Lambda(\mathbf{X}) = \ln p_{\mathbf{X}|H}(\mathbf{X}|H_1) - \ln p_{\mathbf{X}|H}(\mathbf{X}|H_0) \;\overset{\text{‘}H_1\text{’}}{\underset{\text{‘}H_0\text{’}}{\gtrless}}\; \ln\frac{\pi_0}{\pi_1} \tag{3.1.4}$$

minimizes, and thus bounds, the probability of error. Although this is a binary hypothesis test, it is not the standard detection problem because there is no null hypothesis in which the source is absent.

3.1.1  Derivation 

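Having formally stated the scenario and chosen the type of bound, the task is now to derive an expression for the minimum probability of error. The procedure closely follows [17], modified here for complex distributions and matrix measurements. The derivation is in four steps: 1) forming the log-likelihood ratio test, 2) establishing the characteristic functions of the test statistic under both hypotheses, 3) evaluating the cumulative distribution functions from the characteristic functions, and 4) expressing the minimum probability of error in terms of the cumulative distribution functions.

The first step is deriving an expression for the log-likelihood ratio test. Under hypothesis $H_i$, $i \in \{0, 1\}$, the data matrix, $\mathbf{X}$, is zero-mean complex Gaussian with known covariance matrix

$$\mathbf{R}_i = \sigma^2 \mathbf{v}_i \mathbf{v}_i^H + \mathbf{R}_n. \tag{3.1.5}$$

The probability density for $\mathbf{X}$ under $H_i$ is a function of the sample covariance matrix only,

$$p_{\hat{\mathbf{R}}|H}(\hat{\mathbf{R}}|\mathbf{R}_i) = |\pi \mathbf{R}_i|^{-K} \cdot \exp\left\{ -K \operatorname{tr}(\mathbf{R}_i^{-1}\hat{\mathbf{R}}) \right\} \tag{3.1.6}$$

where $|\cdot|$ denotes the matrix determinant. The log-likelihood function under $H_i$ is then

$$\ln p_{\hat{\mathbf{R}}|H}(\hat{\mathbf{R}}|\mathbf{R}_i) = -K \ln|\pi \mathbf{R}_i| - K \operatorname{tr}(\mathbf{R}_i^{-1}\hat{\mathbf{R}}), \tag{3.1.7}$$

resulting in the log-likelihood ratio

$$\ln \Lambda(\hat{\mathbf{R}}) = K \cdot \left[ -\ln|\pi \mathbf{R}_1| - \operatorname{tr}(\mathbf{R}_1^{-1}\hat{\mathbf{R}}) + \ln|\pi \mathbf{R}_0| + \operatorname{tr}(\mathbf{R}_0^{-1}\hat{\mathbf{R}}) \right] = K \cdot \left\{ \ln\frac{|\mathbf{R}_0|}{|\mathbf{R}_1|} + \operatorname{tr}\left[ (\mathbf{R}_0^{-1} - \mathbf{R}_1^{-1})\hat{\mathbf{R}} \right] \right\}. \tag{3.1.8}$$

Substituting this expression into Equation 3.1.4 yields the log-likelihood ratio test

$$K \cdot \left\{ \ln\frac{|\mathbf{R}_0|}{|\mathbf{R}_1|} + \operatorname{tr}\left[ (\mathbf{R}_0^{-1} - \mathbf{R}_1^{-1})\hat{\mathbf{R}} \right] \right\} \;\overset{\text{‘}H_1\text{’}}{\underset{\text{‘}H_0\text{’}}{\gtrless}}\; \ln\frac{\pi_0}{\pi_1}$$

$$K \cdot \operatorname{tr}\left[ (\mathbf{R}_0^{-1} - \mathbf{R}_1^{-1})\hat{\mathbf{R}} \right] \;\overset{\text{‘}H_1\text{’}}{\underset{\text{‘}H_0\text{’}}{\gtrless}}\; \ln\frac{\pi_0}{\pi_1} - K \cdot \ln\frac{|\mathbf{R}_0|}{|\mathbf{R}_1|}. \tag{3.1.9}$$

The test is written more compactly after defining a few terms:

$$\eta \triangleq \ln\frac{\pi_0}{\pi_1} - K \cdot \ln\frac{|\mathbf{R}_0|}{|\mathbf{R}_1|} \tag{3.1.10}$$

$$\mathbf{Q} \triangleq \mathbf{R}_0^{-1} - \mathbf{R}_1^{-1} \tag{3.1.11}$$

$$\ell(\hat{\mathbf{R}}) \triangleq K \cdot \operatorname{tr}(\mathbf{Q}\hat{\mathbf{R}}) \tag{3.1.12}$$

$$\ell(\hat{\mathbf{R}}) \;\overset{\text{‘}H_1\text{’}}{\underset{\text{‘}H_0\text{’}}{\gtrless}}\; \eta. \tag{3.1.13}$$

In this form, the scalar function $\ell(\hat{\mathbf{R}})$ is the test statistic and $\eta$ is the threshold.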
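
The compact test can be sketched numerically to confirm that it is equivalent to comparing the two log-likelihoods directly. This is a minimal illustration under stated assumptions, not code from the thesis: it uses a single hypothetical vector-sensor ($N = 1$) with assumed gains $[1, \cos\phi, \pm\sin\phi, 0]$, white noise, and arbitrary values for the source power, priors, and number of observations. All function names are hypothetical.

```python
import numpy as np

def lrt_terms(r0, r1, n_obs, prior0=0.5):
    """Threshold eta and matrix Q of the compact log-likelihood ratio test."""
    _, logdet0 = np.linalg.slogdet(r0)
    _, logdet1 = np.linalg.slogdet(r1)
    eta = np.log(prior0 / (1.0 - prior0)) - n_obs * (logdet0 - logdet1)
    q = np.linalg.inv(r0) - np.linalg.inv(r1)
    return eta, q

def sample_covariance(x):
    """R_hat = X X^H / K for a data matrix X with K columns."""
    return x @ x.conj().T / x.shape[1]

def lrt_statistic(x, q):
    """Test statistic l(R_hat) = K tr(Q R_hat)."""
    return x.shape[1] * np.trace(q @ sample_covariance(x)).real

def log_likelihood(x, r):
    """Log-likelihood -K ln|pi R| - K tr(R^{-1} R_hat) of the data under covariance r."""
    n_obs = x.shape[1]
    _, logdet = np.linalg.slogdet(np.pi * r)
    return -n_obs * (logdet + np.trace(np.linalg.solve(r, sample_covariance(x))).real)

rng = np.random.default_rng(0)
phi, sigma2, n_obs, prior0 = np.pi / 3, 2.0, 8, 0.3     # hypothetical values
v0 = np.array([1.0, np.cos(phi), np.sin(phi), 0.0])     # assumed replica, right side
v1 = np.array([1.0, np.cos(phi), -np.sin(phi), 0.0])    # assumed replica, left side
r0 = np.eye(4) + sigma2 * np.outer(v0, v0)
r1 = np.eye(4) + sigma2 * np.outer(v1, v1)

# Draw K observations under H1: x = L w with L L^H = R1 and w standard complex Gaussian.
w = (rng.standard_normal((4, n_obs)) + 1j * rng.standard_normal((4, n_obs))) / np.sqrt(2.0)
x = np.linalg.cholesky(r1) @ w

eta, q = lrt_terms(r0, r1, n_obs, prior0)
stat = lrt_statistic(x, q)
decision = 1 if stat > eta else 0
log_ratio = log_likelihood(x, r1) - log_likelihood(x, r0)
```

The quantity `stat - eta` should match `log_ratio - ln(prior0 / (1 - prior0))`, so thresholding the statistic at `eta` gives the same decision as the full likelihood ratio test.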

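The second step in the derivation finds the characteristic function of the test statistic under each hypothesis. The characteristic function, $\Phi_i(\cdot)$, of the test statistic under hypothesis $H_i$ is given by the Fourier transform,

$$\Phi_i(\omega) = \int |\pi \mathbf{R}_i|^{-K} \cdot \exp\left\{ -K \operatorname{tr}(\mathbf{R}_i^{-1}\hat{\mathbf{R}}) \right\} \exp\left\{ -j\omega K \operatorname{tr}(\mathbf{Q}\hat{\mathbf{R}}) \right\} d\mathbf{X} = \int |\pi \mathbf{R}_i|^{-K} \cdot \exp\left\{ -K \operatorname{tr}\left[ (\mathbf{R}_i^{-1} + j\omega \mathbf{Q})\hat{\mathbf{R}} \right] \right\} d\mathbf{X}. \tag{3.1.14}$$

To find a closed form for this integral, the exponential involving $\hat{\mathbf{R}}$ is converted into the form of a complex Gaussian density (Equation 3.1.6). Defining the covariance matrix

$$\mathbf{S} \triangleq (\mathbf{R}_i^{-1} + j\omega \mathbf{Q})^{-1}, \tag{3.1.15}$$

the determinant becomes

$$|\pi \mathbf{R}_i| = \left| \pi (\mathbf{I} + j\omega \mathbf{R}_i \mathbf{Q})(\mathbf{I} + j\omega \mathbf{R}_i \mathbf{Q})^{-1}\mathbf{R}_i \right| = \left| \pi (\mathbf{I} + j\omega \mathbf{R}_i \mathbf{Q})\, \mathbf{S} \right| = |\mathbf{I} + j\omega \mathbf{R}_i \mathbf{Q}| \cdot |\pi \mathbf{S}|. \tag{3.1.16}$$

Incorporating this result into the integral, Equation 3.1.14, obtains a closed form for the characteristic function:

$$\Phi_i(\omega) = \int |\mathbf{I} + j\omega \mathbf{R}_i \mathbf{Q}|^{-K} \cdot |\pi \mathbf{S}|^{-K} \exp\left\{ -K \operatorname{tr}(\mathbf{S}^{-1}\hat{\mathbf{R}}) \right\} d\mathbf{X} = |\mathbf{I} + j\omega \mathbf{R}_i \mathbf{Q}|^{-K} \int |\pi \mathbf{S}|^{-K} \exp\left\{ -K \operatorname{tr}(\mathbf{S}^{-1}\hat{\mathbf{R}}) \right\} d\mathbf{X} = |\mathbf{I} + j\omega \mathbf{R}_i \mathbf{Q}|^{-K}. \tag{3.1.17}$$

The characteristic function is now in closed form, but its use is still limited because each evaluation involves the determinant of a potentially large matrix. Thankfully, evaluating the determinant under both hypotheses is simplified through the use of a generalized eigenvalue decomposition. Writing out the characteristic functions under both hypotheses gives

$$\Phi_1(\omega) = \left| \mathbf{I} + j\omega \mathbf{R}_1 (\mathbf{R}_0^{-1} - \mathbf{R}_1^{-1}) \right|^{-K} = \left| \mathbf{I} + j\omega (\mathbf{R}_1 \mathbf{R}_0^{-1} - \mathbf{I}) \right|^{-K} \tag{3.1.18}$$

$$\Phi_0(\omega) = \left| \mathbf{I} + j\omega \mathbf{R}_0 (\mathbf{R}_0^{-1} - \mathbf{R}_1^{-1}) \right|^{-K} = \left| \mathbf{I} + j\omega (\mathbf{I} - \mathbf{R}_0 \mathbf{R}_1^{-1}) \right|^{-K} = \left| \mathbf{I} + j\omega \left[ \mathbf{I} - (\mathbf{R}_1 \mathbf{R}_0^{-1})^{-1} \right] \right|^{-K}. \tag{3.1.19}$$

Recalling that the determinant of a matrix is the product of its eigenvalues, both characteristic functions are expressible in terms of $\lambda_n$, the eigenvalues of $\mathbf{R}_1 \mathbf{R}_0^{-1}$:

$$\Phi_1(\omega) = \prod_n \left[ 1 + j\omega(\lambda_n - 1) \right]^{-K} \tag{3.1.20}$$

$$\Phi_0(\omega) = \prod_n \left[ 1 + j\omega(1 - 1/\lambda_n) \right]^{-K}. \tag{3.1.21}$$

These $\lambda_n$ are also known as the “generalized eigenvalues” of the pair $(\mathbf{R}_1, \mathbf{R}_0)$. Equations 3.1.20 and 3.1.21 allow both characteristic functions to be evaluated easily from a shared eigenvalue decomposition.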

The third step in deriving a bound on the probability of error is evaluating the cumulative distribution functions from their associated characteristic functions. The cumulative distribution function of the test statistic, $\ell(\hat{\mathbf{R}})$, is related to the characteristic function through the integral

$$P\left[ \ell(\hat{\mathbf{R}}) < \eta \,\middle|\, H_i \right] = \frac{1}{2} - \frac{1}{\pi} \int_0^{\infty} \frac{1}{\omega}\, \operatorname{Im}\left\{ \Phi_i^*(\omega)\, e^{-j\omega\eta} \right\} d\omega \tag{3.1.22}$$

as given in [18], where the conjugate accounts for the $e^{-j\omega\ell}$ sign convention of the Fourier transform in Equation 3.1.14. Numerical evaluation of the integral in Equation 3.1.22 is complicated by the infinite upper limit. Equations 3.1.20 and 3.1.21, however, decrease rapidly with $\omega$. With $L$ non-unity eigenvalues, the integrand decreases asymptotically like $\omega^{-(LK+1)}$. Such a fast decay means that evaluating the integral in Equation 3.1.22 to high precision only requires a finite and reasonably small upper limit. Alternative approaches such as partial fraction expansion and saddle-point methods yield higher precision, especially in the tails of the distribution, but are unnecessary for this problem.¹

The trivial last step in the derivation is expressing the probability of error in terms of the cumulative distribution functions of the test statistic. For this binary hypothesis test, the probability of error is determined by the weighted probability of error under each hypothesis. The total error is easily written in terms of the cumulative distribution functions,

$$P_e = \pi_0 P\left\{ \ell(\hat{\mathbf{R}}) > \eta \,\middle|\, H_0 \right\} + \pi_1 P\left\{ \ell(\hat{\mathbf{R}}) < \eta \,\middle|\, H_1 \right\} = \pi_0 \left[ 1 - P\left\{ \ell(\hat{\mathbf{R}}) < \eta \,\middle|\, H_0 \right\} \right] + \pi_1 P\left\{ \ell(\hat{\mathbf{R}}) < \eta \,\middle|\, H_1 \right\} \tag{3.1.23}$$

where $P_e$ is the probability of error. Summarizing the derivation, the minimum probability of error for the classification problem illustrated in Figure 3.1.1 is given by Equation 3.1.23. This equation is evaluated with the help of Equation 3.1.22 applied to Equations 3.1.20 and 3.1.21.

¹ All results shown are from numerical integration verified by partial fraction decomposition.
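
The four steps of the derivation can be assembled into a short numerical sketch. It is an illustration under stated assumptions rather than the code behind the figures: the replica vector (a half-wavelength line array of 3-D vector-sensors with gains $[1, \cos\phi, \sin\phi, 0]$) is assumed, white noise is used, and the inversion integral is evaluated with a trapezoidal rule on a logarithmic frequency grid whose limits are arbitrary choices. With the characteristic function defined as $E\{e^{-j\omega\ell}\}$, the inversion is written in the equivalent form $\tfrac{1}{2} + \tfrac{1}{\pi}\int_0^\infty \omega^{-1}\operatorname{Im}\{\Phi_i(\omega)e^{j\omega\eta}\}\,d\omega$.

```python
import numpy as np

def replica(phi, n_sensors, spacing_wavelengths=0.5):
    """Assumed replica of a line array of 3-D vector-sensors (length 4N)."""
    phase = np.exp(2j * np.pi * spacing_wavelengths * np.arange(n_sensors) * np.cos(phi))
    gains = np.array([1.0, np.cos(phi), np.sin(phi), 0.0])   # pressure, x, y, z
    return np.kron(phase, gains)

def generalized_eigenvalues(r1, r0):
    """Eigenvalues of R1 R0^{-1}, computed from the whitened Hermitian form."""
    chol = np.linalg.cholesky(r0)
    half = np.linalg.solve(chol, r1)                   # L^{-1} R1
    whitened = np.linalg.solve(chol, half.conj().T)    # L^{-1} R1 L^{-H}
    return np.linalg.eigvalsh(whitened)

def statistic_cdf(eta, coeffs, n_obs, n_grid=20001):
    """P[l < eta] when E{exp(-j w l)} = prod_n (1 + j w c_n)^(-K)."""
    t = np.linspace(np.log(1e-8), np.log(1e8), n_grid)        # w = exp(t), dw / w = dt
    w = np.exp(t)
    phi_w = np.prod((1.0 + 1j * w[:, None] * coeffs[None, :]) ** (-n_obs), axis=1)
    g = np.imag(phi_w * np.exp(1j * w * eta))
    integral = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t))
    return 0.5 + integral / np.pi

def error_probability(r0, r1, n_obs, prior0=0.5):
    """Minimum probability of error from the two cumulative distribution functions."""
    lam = generalized_eigenvalues(r1, r0)
    _, logdet0 = np.linalg.slogdet(r0)
    _, logdet1 = np.linalg.slogdet(r1)
    eta = np.log(prior0 / (1.0 - prior0)) - n_obs * (logdet0 - logdet1)
    cdf0 = statistic_cdf(eta, 1.0 - 1.0 / lam, n_obs)
    cdf1 = statistic_cdf(eta, lam - 1.0, n_obs)
    return prior0 * (1.0 - cdf0) + (1.0 - prior0) * cdf1, lam

n_sensors, phi, sigma2 = 10, np.pi / 3, 0.1                 # hypothetical scenario
v0, v1 = replica(phi, n_sensors), replica(-phi, n_sensors)
identity = np.eye(4 * n_sensors)
r0 = identity + sigma2 * np.outer(v0, v0.conj())
r1 = identity + sigma2 * np.outer(v1, v1.conj())

pe_single, lam = error_probability(r0, r1, n_obs=1)
pe_four, _ = error_probability(r0, r1, n_obs=4)
```

For a single observation, equal priors, and white noise, the test statistic is a difference of two scaled exponential variables, and the error probability reduces to $1/(1 + \lambda_{\max})$. That value provides a simple check on the numerical integration.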


3.1.2  Analysis 

The probability of error bound derived above involves many independent variables: arrival angle, signal-to-noise ratio (SNR), frequency, number of sensors, noise distribution, and the prior probabilities of each hypothesis. This subsection analyzes the bound under one insightful scenario with white noise and equal prior probabilities. In this subset of the parameter space, a simple formula illustrates the behavior of the bound and reveals important properties of linear vector-sensor arrays.

With white noise and equal prior probabilities, the remaining independent variables are arrival angle, SNR, frequency, and number of sensors. Holding the latter two variables constant at moderate values of $f = \tfrac{5}{7} f_d$ and $N = 10$ sensors, the bound is evaluated as a function of arrival angle, $\cos\phi$, and array SNR, $2N\sigma^2$. Contours of the resulting image, shown in Figure 3.1.2, reveal the ability of a linear vector-sensor array to resolve left/right ambiguities. The results are symmetric about broadside and endfire, so only one quadrant is shown. The most apparent feature of Figure 3.1.2 is that the performance of the vector-sensor array is effectively uniform over a large fraction of cosine-space. The SNR required to achieve a given probability of error changes by less than 3 dB for most of the space ($-0.9 < \cos\phi < 0$) but diverges at endfire ($-1 < \cos\phi < -0.9$). Another feature of Figure 3.1.2 is that a low probability of error is achieved almost everywhere even with a moderate SNR. Specifically, less than 10% probability of error is achieved over 90% of cosine-space for a very weak, 3 dB source.

The first question raised by the good left/right resolution in Figure 3.1.2 is how the behavior scales with frequency or the number of sensors. Figure 3.1.3 displays the same probability of error contours for a different scenario at $f = \tfrac{1}{7} f_d$ and with only $N = 3$ sensors. One might expect the performance to suffer because of the low spatial resolution of such an array, but it appears unchanged from the original Figure 3.1.2. Intuitively, the ability of vector-sensors to resolve pressure ambiguities depends only on the directional sensors. The performance measured in the figures does not change because the response of the directional sensors does not change with frequency or number.

Figure 3.1.3: Probability of error contours: $f = \tfrac{1}{7} f_d$, $N = 3$

The concept of statistical divergence provides much more rigorous insight into the classification bound. Loosely speaking, divergences measure the dissimilarity between two distributions. In this case, divergences quantify the left/right information measured by the vector-sensor array. Intuitively, the more information provided by the array, the lower the probability of error. One divergence that often arises in information theory and binary hypothesis testing is the Kullback-Leibler (K-L) divergence [19, §12.7-12.9]. The K-L divergence between two probability distributions, $p_0(x)$ and $p_1(x)$, is

$$D(p_0 \| p_1) = \int_{-\infty}^{\infty} p_0(x) \ln\frac{p_0(x)}{p_1(x)}\, dx. \tag{3.1.24}$$

The K-L divergence takes a simple form for the zero-mean, complex Gaussian distributions considered here:

$$D(p_0 \| p_1) = \frac{K}{2} \left\{ \ln\frac{|\mathbf{R}_1|}{|\mathbf{R}_0|} + \operatorname{tr}(\mathbf{R}_1^{-1}\mathbf{R}_0) - 4N \right\}. \tag{3.1.25}$$


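Recall that $K$ is the number of independent observations and $N$ is the number of vector-sensors. Under the weak condition that the noise is left/right symmetric (see Appendix A.1), the K-L divergence simplifies to

$$D(p_0 \| p_1) = \frac{K}{2} \left[ \operatorname{tr}(\mathbf{R}_1^{-1}\mathbf{R}_0) - 4N \right] \tag{3.1.26}$$

because $|\mathbf{R}_1| = |\mathbf{R}_0|$. In this case, the K-L divergence is symmetric with respect to its arguments, $D(p_0 \| p_1) = D(p_1 \| p_0)$, and proportional to the J-divergence used in [17]. For the case of white noise, $\mathbf{R}_i = \mathbf{I} + \sigma^2 \mathbf{v}_i \mathbf{v}_i^H$. Applying the matrix inversion lemma to $\mathbf{R}_1^{-1}$ yields

$$D(p_0 \| p_1) = \frac{K}{2} \left\{ \operatorname{tr}\left[ \left( \mathbf{I} - \frac{\sigma^2 \mathbf{v}_1 \mathbf{v}_1^H}{1 + \sigma^2 \mathbf{v}_1^H \mathbf{v}_1} \right) \left( \mathbf{I} + \sigma^2 \mathbf{v}_0 \mathbf{v}_0^H \right) \right] - 4N \right\}$$

$$= \frac{K}{2} \left\{ \operatorname{tr}\left[ \mathbf{I} + \sigma^2 \mathbf{v}_0 \mathbf{v}_0^H - \frac{\sigma^2 \mathbf{v}_1 \mathbf{v}_1^H + \sigma^4 \mathbf{v}_1 \mathbf{v}_1^H \mathbf{v}_0 \mathbf{v}_0^H}{1 + \sigma^2 \mathbf{v}_1^H \mathbf{v}_1} \right] - 4N \right\}$$

$$= \frac{K}{2} \left\{ 4N + 2N\sigma^2 - \frac{2N\sigma^2}{1 + 2N\sigma^2} - \frac{\sigma^4}{1 + 2N\sigma^2} \left| \mathbf{v}_1^H \mathbf{v}_0 \right|^2 - 4N \right\}. \tag{3.1.27}$$

Recalling the work in [1, §2.2.2],

$$\mathbf{v}_1^H \mathbf{v}_0 = 2N \cos^2\phi. \tag{3.1.28}$$

Substituting into Equation 3.1.27 and collecting terms yields the simple expression

$$D(p_0 \| p_1) = \frac{K}{2} \cdot \frac{\gamma^2}{1 + \gamma} \left( 1 - \cos^4\phi \right) \tag{3.1.29}$$

where

$$\gamma = 2N\sigma^2 \tag{3.1.30}$$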

is the array SNR. The K-L divergence is a function of only cosine angle, array SNR, and number of observations; it does not depend on frequency or number of sensors. Just as each choice of independent variables has an associated probability of error, each also has an associated divergence. Contours of the divergence, analogous to the probability of error contours plotted in Figures 3.1.2 and 3.1.3, are given by curves of constant value, $D$:

$$\frac{K}{2} \cdot \frac{\gamma^2}{1 + \gamma} \left( 1 - \cos^4\phi \right) = D. \tag{3.1.31}$$

Figure 3.1.4: Divergence contours: $N = 10$ (horizontal axis: $\cos\phi$)

Parameterizing these curves as functions of $\cos\phi$ involves solving the quadratic Equation 3.1.31 for a positive $\gamma$ to get

$$\gamma = \frac{1}{1 - \cos^4\phi} \left[ \frac{D}{K} + \sqrt{ \left( \frac{D}{K} \right)^2 + 2\,\frac{D}{K} \left( 1 - \cos^4\phi \right) } \right]. \tag{3.1.32}$$
A carefully chosen set of these divergence contours is shown in Figure 3.1.4. When the number of observations is held constant, contours of equal divergence correspond exactly to contours of equal error probability in Figures 3.1.2 and 3.1.3. A proof of this correspondence is given in Appendix A.1. Although Equation 3.1.32 has a much simpler form than the probability of error, it still captures the important aspects of left/right resolution with vector-sensor arrays. Namely, it formally proves that the left/right resolution is independent of both frequency and number of sensors and is uniform over most of cosine space.

Figure 3.1.5: Probability of error contours: $f = \tfrac{5}{7} f_d$, $N = 10$, $\Delta \approx 0.05$
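
The closed-form divergence and its contours can be checked against the general matrix expression. The sketch below is illustrative and rests on assumptions not stated in this section: the replica vector (half-wavelength line array, gains $[1, \cos\phi, \sin\phi, 0]$ per sensor) is assumed, white noise is used, and the $K/2$ prefactor is taken exactly as written in Equations 3.1.25 and 3.1.29. The function names and numerical values are hypothetical.

```python
import numpy as np

def replica(phi, n_sensors, spacing_wavelengths=0.5):
    """Assumed replica of a line array of 3-D vector-sensors (length 4N)."""
    phase = np.exp(2j * np.pi * spacing_wavelengths * np.arange(n_sensors) * np.cos(phi))
    return np.kron(phase, np.array([1.0, np.cos(phi), np.sin(phi), 0.0]))

def kl_divergence(r0, r1, n_obs):
    """General matrix form, including the K/2 prefactor used in the text."""
    dim = r0.shape[0]
    _, logdet0 = np.linalg.slogdet(r0)
    _, logdet1 = np.linalg.slogdet(r1)
    trace_term = np.trace(np.linalg.solve(r1, r0)).real
    return 0.5 * n_obs * (logdet1 - logdet0 + trace_term - dim)

def kl_closed_form(cos_phi, gamma, n_obs):
    """White-noise closed form in terms of array SNR gamma = 2 N sigma^2."""
    return 0.5 * n_obs * gamma**2 / (1.0 + gamma) * (1.0 - cos_phi**4)

def contour_snr(cos_phi, divergence, n_obs):
    """Array SNR on the contour of constant divergence (positive quadratic root)."""
    a = 1.0 - cos_phi**4
    dk = divergence / n_obs
    return (dk + np.sqrt(dk**2 + 2.0 * dk * a)) / a

n_sensors, phi, sigma2, n_obs = 10, 1.0, 0.15, 5        # hypothetical scenario
gamma = 2.0 * n_sensors * sigma2
v0, v1 = replica(phi, n_sensors), replica(-phi, n_sensors)
identity = np.eye(4 * n_sensors)
r0 = identity + sigma2 * np.outer(v0, v0.conj())
r1 = identity + sigma2 * np.outer(v1, v1.conj())

d_matrix = kl_divergence(r0, r1, n_obs)
d_closed = kl_closed_form(np.cos(phi), gamma, n_obs)
gamma_back = contour_snr(np.cos(phi), d_closed, n_obs)
```

Changing `n_sensors` or `spacing_wavelengths` while holding `gamma` fixed leaves `d_matrix` unchanged, which is the independence from frequency and number of sensors described above.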

The results shown above indicate that linear vector-sensor arrays are theoretically effective at determining left/right. However, to demonstrate that vector-sensor performance is robust, i.e. not reliant on point nulls or “supergain”, the analysis is repeated for spatially spread sources. Recalling the discussion of spatial spreading in Section 2.4, a covariance matrix taper is used to approximate a uniformly distributed source. Returning to the $N = 10$ element array at $f = \tfrac{5}{7} f_d$ from Figure 3.1.2, the same source is spread over 1/6 of a beamwidth ($\Delta \approx 0.05$ in cosine-space) to obtain Figure 3.1.5. Comparing the two figures reveals that distributing the source has a minor effect on the left/right performance. Intuitively, the null that allows for left/right performance is determined by the directional sensors and is already very wide (see [1, Ch. 2]); therefore, it is relatively robust. The results in Figure 3.1.5 suggest that vector-sensor left/right performance is robust in theory.

[Figure 3.2.1: a linear VSA with one source on each side. Right side: $\mathbf{v}_0 = \mathbf{v}(\phi)$, $\mathbf{R}_0 = \mathbf{v}_0 \mathbf{v}_0^H$. Left side: $\mathbf{v}_1 = \mathbf{v}(-\phi)$, $\mathbf{R}_1 = \mathbf{v}_1 \mathbf{v}_1^H$. Measured covariance: $\mathbf{R} = \alpha_0 \mathbf{R}_0 + \alpha_1 \mathbf{R}_1 + \alpha_n \mathbf{R}_n$. Estimate unknowns $\{\alpha_0, \alpha_1, \alpha_n\}$.]

Figure 3.2.1: Example right/left power estimation scenario

3.2  Novel  Estimation-Based  Bound 

Although the classification bound derived in Section 3.1 provides insight into the left/right information measured by a linear vector-sensor array, it relies on the unrealistic assumption that the source power is known. A more realistic passive sonar scenario involves power estimation as illustrated in Figure 3.2.1. In this scenario, the objective is estimating the powers of sources on either side of the array and the power of the background noise. Both sources yield identical pressure measurements, so any ability to resolve differences in power arises from the directional, vector-sensor elements. Assuming zero-mean complex Gaussian distributions, the unknown, deterministic parameters are the three powers $\{\alpha_0, \alpha_1, \alpha_n\}$; the known parameters are the azimuth angles $\pm\phi$ and noise covariance $\mathbf{R}_n$. The derivation does not restrict the form of the noise covariance, but the analysis focuses on white noise. As with the previous problem, the $K$ array measurements are summarized by the sample covariance matrix $\hat{\mathbf{R}}$.

The estimation problem in Figure 3.2.1 is closely related to typical array processing. Passive sonar systems often estimate power at many hypothesized directions and frequencies, displaying these power estimates to trained technicians. The power distribution across angle and frequency is essentially a two-dimensional spectrogram, so the results here are directly related to spectrogram estimation and periodograms (see [5, 20]). Many aspects of this problem also relate to the rejection of interference or jamming.

For the estimation problem in Figure 3.2.1, a number of performance bounds exist [21]. They include the Chapman-Robbins or Barankin bound, Bhattacharyya bound, and Cramér-Rao bound. For random, unknown parameters, other bounds exist, including the Weiss-Weinstein, Ziv-Zakai, and Bayesian Cramér-Rao bounds. The Cramér-Rao bound is chosen here because the goal is providing insight into a novel problem, not obtaining the tightest and most complex bound. The Cramér-Rao bound in this section shares only its name with those in [1, 2, 4, 6]; the underlying problems, derivations, and results are fundamentally different. A good introduction to the Cramér-Rao bound and its application to standard measures of array performance is [5].

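The Cramér-Rao bound for a parameter vector $\boldsymbol{\theta}$ states that the error covariance of any unbiased estimate $\hat{\boldsymbol{\theta}}$ obeys

$$E\left\{ (\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})^T \right\} \succeq \mathbf{J}(\boldsymbol{\theta})^{-1}, \tag{3.2.1}$$

where $\mathbf{J}(\boldsymbol{\theta})$ is the “Fisher information matrix.” Each entry of the Fisher information matrix is given by

$$[\mathbf{J}(\boldsymbol{\theta})]_{ij} = -E\left\{ \frac{\partial^2}{\partial\theta_i\, \partial\theta_j} \ln p(\mathbf{X}|\boldsymbol{\theta}) \right\}. \tag{3.2.2}$$

The matrix inequality above has several equivalent and intuitive interpretations:

$$\mathbf{A} \succeq \mathbf{B} \iff \mathbf{A} - \mathbf{B} \text{ is positive semidefinite} \tag{3.2.3}$$

$$\iff \mathbf{x}^H \mathbf{A} \mathbf{x} \geq \mathbf{x}^H \mathbf{B} \mathbf{x} \quad \forall \mathbf{x}. \tag{3.2.4}$$

A consequence of Equation 3.2.1 is that the error variance of a scalar estimate $\hat{\theta}_i$ is bounded by a diagonal element of the same inverse,

$$E\left\{ (\hat{\theta}_i - \theta_i)^2 \right\} \geq \left[ \mathbf{J}(\boldsymbol{\theta})^{-1} \right]_{ii}. \tag{3.2.5}$$

For the estimation problem in this section, the vector of unknown parameters consists of source and noise powers,

$$\boldsymbol{\theta} = \begin{bmatrix} \alpha_0 & \alpha_1 & \alpha_n \end{bmatrix}^T. \tag{3.2.6}$$

Using the Cramér-Rao bound, this section determines how much a signal on one side of a vector-sensor array interferes with power estimation on the opposing side.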


3.2.1  Derivation 

This section derives a Cramér-Rao lower bound for the power estimation problem outlined above. The derivation is carried out for an arbitrary noise covariance and is much more compact than that of the classification bound in Section 3.1. This brevity is partly because the resulting expression is not easily simplified even in special cases.

The  Fisher  information  matrix  is  hrst  derived  element-by-element.  The  log- 
likelihood  function  for  zero-mean,  complex  Gaussian  measurements  is  given  in  Equa¬ 
tion  3.1.7.  Entries  of  the  associated  Fisher  information  matrix,  J(0),  have  a  very 
simple  form, 

[3(9)1,  =  (3,2,7) 

Because  each  term  is  scaled  by  the  number  of  observations,  K,  this  derivation  focuses 
on  the  single  observation  case  without  loss  of  generality.  The  covariance  matrix,  R, 
is  a  function  of  the  three  unknown  powers, 

$$R = \alpha_0 R_0 + \alpha_1 R_1 + \alpha_n R_n = \alpha_0 v_0 v_0^H + \alpha_1 v_1 v_1^H + \alpha_n R_n. \tag{3.2.8}$$

From this linear combination, the necessary partial derivatives for the three unknown
parameters are easy to compute:

$$\frac{\partial R}{\partial\alpha_0} = v_0 v_0^H \tag{3.2.9}$$

$$\frac{\partial R}{\partial\alpha_1} = v_1 v_1^H \tag{3.2.10}$$

$$\frac{\partial R}{\partial\alpha_n} = R_n. \tag{3.2.11}$$

The  (1,2)  term  in  the  Fisher  information  matrix  involves  the  two  sources  and  is  a 
convenient  place  to  begin: 


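$$\begin{aligned}
[J(\theta)]_{12} &= \mathrm{tr}\left(R^{-1} v_0 v_0^H R^{-1} v_1 v_1^H\right) \\
&= \mathrm{tr}\left(v_1^H R^{-1} v_0 v_0^H R^{-1} v_1\right) \\
&= \left|v_0^H R^{-1} v_1\right|^2.
\end{aligned} \tag{3.2.12}$$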


The first step above (and many steps to come) uses the identity tr(AB) = tr(BA).
Equation 3.2.12 extends to any term involving only the two sources. Moving to the
(1, 3) term involving one source and the noise,

$$[J(\theta)]_{13} = \mathrm{tr}\left(R^{-1} v_0 v_0^H R^{-1} R_n\right) = v_0^H R^{-1} R_n R^{-1} v_0. \tag{3.2.13}$$


Equation 3.2.13 also extends to the other source. The diagonal, (3, 3) term involves
only the noise power:

$$[J(\theta)]_{33} = \mathrm{tr}\left(R^{-1} R_n R^{-1} R_n\right). \tag{3.2.14}$$

Unlike the other terms, Equation 3.2.14 is not easily simplified by eliminating the
matrix trace function.


Enough work exists now to write the full Fisher information matrix. The Fisher
information matrix for this three-parameter case is best viewed in block form,

$$J(\theta) = \begin{bmatrix} J_{ss} & J_{sn} \\ J_{sn}^T & J_{nn} \end{bmatrix}, \tag{3.2.15}$$


where $J_{ss}$ is the 2 x 2 matrix involving only the two sources, $J_{sn}$ is the 2 x 1 vector
involving the sources and noise, and $J_{nn}$ is a scalar involving only the noise. Condensing
the results derived in the previous paragraph,

$$J_{ss} = \left(V^H R^{-1} V\right) \odot \left(V^H R^{-1} V\right)^* \tag{3.2.16}$$

$$J_{sn} = \mathrm{diag}\left(V^H R^{-1} R_n R^{-1} V\right) \tag{3.2.17}$$

$$J_{nn} = \mathrm{tr}\left(R^{-1} R_n R^{-1} R_n\right), \tag{3.2.18}$$

where the diag(·) function extracts the main diagonal of a matrix as a column vector,
$\odot$ is the Hadamard or element-wise product, $*$ denotes conjugation, and

$$V = [v_0 \;\; v_1]. \tag{3.2.19}$$


Each  evaluation  of  the  Fisher  information  matrix  involves  several  matrix  products 
and  a  matrix  inverse.  Although  the  inverse  is  computed  efficiently  via  the  matrix 
inversion  lemma,  the  expressions  are  already  in  their  most  simplified  analytical  form. 
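
The block expressions in Equations 3.2.16 to 3.2.18 translate almost directly into code. The following is a minimal NumPy sketch rather than the implementation used for the figures: the function name `fisher_information`, the random replica vectors, and the chosen powers are illustrative assumptions, and a linear solve stands in for the matrix inversion lemma.

```python
import numpy as np

def fisher_information(V, R_n, alpha_s, alpha_n, K=1):
    """Fisher information matrix for theta = [alpha_0, alpha_1, alpha_n]."""
    V = np.asarray(V, dtype=complex)
    alpha_s = np.asarray(alpha_s, dtype=float)
    # Covariance as a linear combination of source and noise terms (Eq. 3.2.8).
    R = (V * alpha_s) @ V.conj().T + alpha_n * R_n
    Ri_V = np.linalg.solve(R, V)        # R^{-1} V without forming R^{-1}
    Ri_Rn = np.linalg.solve(R, R_n)     # R^{-1} R_n
    G = V.conj().T @ Ri_V               # V^H R^{-1} V
    J_ss = (G * G.conj()).real          # Hadamard product (Eq. 3.2.16)
    J_sn = np.diag(V.conj().T @ Ri_Rn @ Ri_V).real  # Eq. 3.2.17
    J_nn = np.trace(Ri_Rn @ Ri_Rn).real             # Eq. 3.2.18
    J = np.block([[J_ss, J_sn[:, None]],
                  [J_sn[None, :], np.array([[J_nn]])]])
    return K * J                        # each term scales with K (Eq. 3.2.7)

# Hypothetical example: two arbitrary replica vectors in white noise.
rng = np.random.default_rng(0)
M = 8
V = rng.standard_normal((M, 2)) + 1j * rng.standard_normal((M, 2))
R_n = np.eye(M)
J = fisher_information(V, R_n, alpha_s=[0.5, 2.0], alpha_n=1.0)
crb = np.linalg.inv(J)                  # bound on the error covariance (Eq. 3.2.1)
print("bound on std. dev. of the alpha_0 estimate:", np.sqrt(crb[0, 0]))
```

Using `np.linalg.solve` avoids forming $R^{-1}$ explicitly; for large arrays with low-rank source terms, the matrix inversion lemma mentioned above would be the cheaper choice.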


3.2.2  Analysis 

The derivation in the previous section is complete and compact, but it is not fully
satisfying for two reasons. First, its interpretation is not immediately clear. Each
choice of independent variables yields a matrix inequality that is not intuitive. Second,
its dependencies are not obvious. It is difficult to discern from the expressions how
the bound changes with frequency, number of sensors, etc. This section addresses
both points, providing a useful visualization of the bound and an approximation that
clarifies the dependencies.

The first analysis problem is interpreting the bound. A clear interpretation requires
1) reducing the large parameter space to a small region of interest, and 2)
obtaining an intuitive metric from the covariance matrix inequality. The following
paragraphs describe one interpretation of the Cramer-Rao bound and illustrate the
bound with several examples.

The large parameter space makes it difficult to interpret the Cramer-Rao bound.
However, examining a reduced parameter space delineates a clear region of interest.
Assuming other parameters are fixed, the space of all three unknown parameters
$\{\alpha_0, \alpha_1, \alpha_n\}$ provides a useful context for interpreting the bound. Symmetry in the
problem and practical considerations suggest there are only three distinct regions as
determined by the strength of the sources relative to the noise. In the first region,
both sources dominate the noise. This high-SNR region is uninteresting because 1)
sources of interest are often weak for passive sonar and 2) simple algorithms exist that
perform well in the absence of noise. In the second region, the noise dominates one
or more sources. This region is equally uninteresting; reliably estimating the power
of such weak sources is extremely difficult. The third region includes most cases of
interest and is characterized by noise power on the same order as at least one source.
This region of interest is fully explored for the white noise case when $\alpha_n \approx 2N\alpha_0$
and $\alpha_1$ is swept from $\alpha_1 \ll \alpha_0$ to $\alpha_1 \gg \alpha_0$. The factor of $2N$ accounts for the array
gain. Under this scenario, the source $\alpha_0$ is treated as a “target” and $\alpha_1$ as a “jammer.”
The goal is resolving the true power of the target regardless of the jammer
power. The entire region of interest consists of only three regimes: the no jammer
regime ($\alpha_1 \ll \alpha_0$), the weak jammer regime ($\alpha_1 \approx \alpha_0$), and the strong jammer regime
($\alpha_1 \gg \alpha_0$).

Having reduced the parameter space to a reasonable size, the remaining difficulty
is interpreting the covariance matrix inequality. Keeping with the “target and jammer”
interpretation, the wanted parameter is the target power $\alpha_0$; the other powers,
$\{\alpha_1, \alpha_n\}$, are nuisance parameters. The error variance of the wanted parameter, $\alpha_0$,
is bounded below by the (1, 1) term of $J(\theta)^{-1}$ (see Equation 3.2.5). A useful quantity
that summarizes this variance is the normalized root mean square error (NRMSE),

$$\mathrm{NRMSE} = \frac{\sqrt{\mathrm{var}(\hat{\alpha}_0)}}{\alpha_0} \tag{3.2.20}$$

$$\mathrm{CRB(NRMSE)} = \frac{\sqrt{[J(\theta)^{-1}]_{11}}}{\alpha_0} \tag{3.2.21}$$

$$\mathrm{NRMSE} \ge \mathrm{CRB(NRMSE)}, \tag{3.2.22}$$

where $\hat{\alpha}_0$ is any unbiased estimate of $\alpha_0$ and CRB(·) denotes the Cramer-Rao bound.
Figure 3.2.2 plots curves of CRB(NRMSE) versus azimuth cosine for the three regimes.
The standard $N = 10$ element vector-sensor array at $f = 5/7\,f_d$ is used with $K = 1$,
$R_n = I$, $\alpha_0 = 1/(2N)$, and $\alpha_n = 1$. A high CRB indicates that the jammer irrevocably
corrupts estimates of the target power. The most interesting aspect of Figure
3.2.2 is that, as with the classification bound derived in Section 3.1, the predicted
VSA performance is uniformly good over most of cosine space. Specifically, the CRB
changes by less than 3 dB over 90% of space ($-0.9 < \cos\phi < 0$) but diverges at
endfire ($-1 < \cos\phi < -0.9$). As with the classification bound in Section 3.1, good
performance is predicted almost everywhere with even a weak source.

Figure 3.2.2: Left/right power estimation error (horizontal axis: $\cos\phi$): $f = 5/7\,f_d$, $N = 10$

Figure 3.2.3: Left/right power estimation error: $f = 1/7\,f_d$, $N = 3$

As with the classification bound, the results in Figure 3.2.2 do not change significantly
with frequency or number of sensors. Figure 3.2.3 displays the same curves
for an array with much lower resolution: $N = 3$ and $f = 1/7\,f_d$. Comparing Figure
3.2.2 to Figure 3.2.3 reveals negligible differences, suggesting again that the left/right
resolution is an inherent capability of the directional sensors and is almost unaffected
by their number or spacing. Furthermore, because the Fisher information scales with $K$,
the Cramer-Rao bound on the NRMSE is proportional to $1/\sqrt{K}$, so changing the number
of observations only shifts the curves in log-space and does not affect the conclusions.

The vector-sensor performance predicted by the bound also seems to be robust.
As is done for the classification bound in Figure 3.1.5, uniform spatial spreading is
introduced via a covariance matrix taper in Figure 3.2.4. Distributing the sources
increases the bound only slightly and does not change the conclusion that vector-sensor
performance is uniformly good over most of space. As with the classification
bound, the wide null placed by the directional sensors appears to provide robust
left/right discrimination without relying on “supergain.” Introducing a covariance
matrix taper requires modifying the derivation in Section 3.2.1. The bound is not
re-derived here for brevity and because the terms do not simplify beyond Equation
3.2.7.

Figure 3.2.4: Left/right power estimation error: $f = 5/7\,f_d$, $N = 10$, $\Delta \approx 0.05$


The second analysis problem is formalizing the dependencies of the Cramer-Rao
bound. Figures 3.2.2 and 3.2.3 suggest that VSA performance does not depend on the
number of sensors or the frequency. The following analysis proves this independence
in the strong jammer regime, $\alpha_1 \to \infty$. The strong jammer regime is intuitively the
most difficult, reflecting the “worst-case” VSA performance. In the above limit,

$$R^{-1}\frac{\partial R}{\partial\alpha_1} \to 0, \tag{3.2.23}$$


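so the terms in the Fisher information matrix dealing with the jammer go to zero.
Thus, the jammer can be treated as deterministic and removed from the matrix (see
[5, Example 8.11.2]). The CRB(NRMSE) depends only on the (1, 1) term of $J(\theta)^{-1}$,
which follows from inverting the remaining 2 x 2 matrix for $\{\alpha_0, \alpha_n\}$:

$$[J(\theta)^{-1}]_{11} = \left[\left(v_0^H R^{-1} v_0\right)^2 - \frac{\left(v_0^H R^{-1} R_n R^{-1} v_0\right)^2}{\mathrm{tr}\left(R^{-1} R_n R^{-1} R_n\right)}\right]^{-1}, \tag{3.2.24}$$

where $K = 1$ is assumed. Applying the matrix inversion lemma to $R^{-1}$ simplifies
Equation 3.2.24. The white noise scenario with $R_n = I$, $\alpha_0 = 1/(2N)$, and $\alpha_n = 1$
yields an intuitive and simple result. Writing the covariance matrix,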


$$R = I + [v_0 \;\; v_1] \begin{bmatrix} \alpha_0 & 0 \\ 0 & \alpha_1 \end{bmatrix} [v_0 \;\; v_1]^H, \tag{3.2.25}$$

the  inverse  is 


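$$R^{-1} = I - [v_0 \;\; v_1] \left( \begin{bmatrix} 1/\alpha_0 & 0 \\ 0 & 1/\alpha_1 \end{bmatrix} + [v_0 \;\; v_1]^H [v_0 \;\; v_1] \right)^{-1} [v_0 \;\; v_1]^H. \tag{3.2.26}$$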


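Recall that the inner product $v_1^H v_0$ is given in Equation 3.1.28; the algebra below uses
$v_0^H v_0 = v_1^H v_1 = 2N$ and $v_1^H v_0 = 2N\cos^2\phi$. Taking the limit
$\alpha_1 \to \infty$, substituting for $\alpha_0$, and using some algebra gives

$$R^{-1} = I - [v_0 \;\; v_1] \left( 2N \begin{bmatrix} 2 & \cos^2\phi \\ \cos^2\phi & 1 \end{bmatrix} \right)^{-1} [v_0 \;\; v_1]^H. \tag{3.2.27}$$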


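Applying this inverse to the target replica vector yields a simple expression for $R^{-1}v_0$:

$$\begin{aligned}
R^{-1} v_0 &= v_0 - [v_0 \;\; v_1] \begin{bmatrix} 2 & \cos^2\phi \\ \cos^2\phi & 1 \end{bmatrix}^{-1} \frac{1}{2N} \begin{bmatrix} 2N \\ 2N\cos^2\phi \end{bmatrix} \\
&= v_0 - [v_0 \;\; v_1] \begin{bmatrix} 2 & \cos^2\phi \\ \cos^2\phi & 1 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ \cos^2\phi \end{bmatrix} \\
&= v_0 - [v_0 \;\; v_1] \frac{1}{2 - \cos^4\phi} \begin{bmatrix} 1 & -\cos^2\phi \\ -\cos^2\phi & 2 \end{bmatrix} \begin{bmatrix} 1 \\ \cos^2\phi \end{bmatrix} \\
&= v_0 - [v_0 \;\; v_1] \frac{1}{2 - \cos^4\phi} \begin{bmatrix} 1 - \cos^4\phi \\ \cos^2\phi \end{bmatrix} \\
&= \frac{v_0 - v_1 \cos^2\phi}{2 - \cos^4\phi}.
\end{aligned} \tag{3.2.28}$$

This expression yields two of the three terms required in Equation 3.2.24:

$$v_0^H R^{-1} v_0 = \frac{2N - 2N\cos^4\phi}{2 - \cos^4\phi} = 2N\,\frac{1 - \cos^4\phi}{2 - \cos^4\phi} \tag{3.2.29}$$

$$v_0^H R^{-2} v_0 = \frac{2N - 4N\cos^4\phi + 2N\cos^4\phi}{(2 - \cos^4\phi)^2} = 2N\,\frac{1 - \cos^4\phi}{(2 - \cos^4\phi)^2}. \tag{3.2.30}$$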


The remaining term, the trace of $R^{-2}$, comes by inspection of the eigenvalues of
$R^{-1}$. In the above limit, $R^{-1}$ has one zero eigenvalue and one non-unity eigenvalue;
the remaining $4N - 2$ eigenvalues are unity. The non-unity eigenvalue arises from
the noise combined with the component of the target orthogonal to $v_1$ and is thus
$[1 + (1 - \cos^4\phi)]^{-1}$. The trace is therefore

$$\mathrm{tr}\left(R^{-2}\right) = (4N - 2) + (2 - \cos^4\phi)^{-2} = \frac{(4N - 2)(2 - \cos^4\phi)^2 + 1}{(2 - \cos^4\phi)^2}. \tag{3.2.31}$$

Substituting terms into Equation 3.2.24 gives

$$\begin{aligned}
[J(\theta)^{-1}]_{11} &= \left[ (2N)^2 \frac{(1 - \cos^4\phi)^2}{(2 - \cos^4\phi)^2} - \frac{(2N)^2 (1 - \cos^4\phi)^2 / (2 - \cos^4\phi)^4}{\left[(4N - 2)(2 - \cos^4\phi)^2 + 1\right] / (2 - \cos^4\phi)^2} \right]^{-1} \\
&= \frac{(2 - \cos^4\phi)^2}{(2N)^2 (1 - \cos^4\phi)^2} \left[ 1 - \frac{1}{(4N - 2)(2 - \cos^4\phi)^2 + 1} \right]^{-1}.
\end{aligned} \tag{3.2.32}$$

Noticing that

$$\frac{1}{(4N - 2)(2 - \cos^4\phi)^2 + 1} \ll 1 \tag{3.2.33}$$

for even a small number of sensors, Equation 3.2.21 is well-approximated by

$$\mathrm{CRB(NRMSE)} \approx \frac{2 - \cos^4\phi}{1 - \cos^4\phi} = 1 + \frac{1}{1 - \cos^4\phi}. \tag{3.2.34}$$


Furthermore, the above approximation remains a valid lower bound because it is
always less than the CRB. Equation 3.2.34 is a key result, as it verifies analytically
that in the strong jammer regime, the normalized Cramer-Rao bound does not depend
on frequency and only very weakly on the number of sensors. It predicts with a high
degree of accuracy the “Strong Jammer” curves in Figures 3.2.2 and 3.2.3.
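
The strong-jammer approximation can be checked numerically against the full three-parameter bound. The sketch below is illustrative only and rests on several assumptions not stated in this section: a uniform line array along the x-axis, a single-sensor response `h` with pressure plus three velocity components and sources in the horizontal plane, and a large but finite jammer power (`alpha_1 = 1e3`) standing in for the limit. The helper names are hypothetical.

```python
import numpy as np

def h(phi):
    # Assumed single vector-sensor response: pressure plus three velocity
    # components, with the vertical component zero for horizontal sources.
    return np.array([1.0, np.cos(phi), np.sin(phi), 0.0])

def replica(phi, N, spacing):
    # Assumed uniform line array on the x-axis; `spacing` is in wavelengths.
    v_p = np.exp(2j * np.pi * spacing * np.arange(N) * np.cos(phi))
    return np.kron(v_p, h(phi))

def crb_nrmse(phi, N, alpha_1, spacing=0.25):
    """CRB(NRMSE) of the target power with R_n = I, alpha_n = 1, K = 1."""
    alpha_0 = 1.0 / (2 * N)
    # Target at +phi, jammer at the left/right ambiguity -phi.
    V = np.column_stack([replica(phi, N, spacing), replica(-phi, N, spacing)])
    R = (V * np.array([alpha_0, alpha_1])) @ V.conj().T + np.eye(4 * N)
    Ri = np.linalg.inv(R)
    G = V.conj().T @ Ri @ V
    J = np.empty((3, 3))
    J[:2, :2] = np.abs(G) ** 2                                    # Eq. 3.2.16
    J[:2, 2] = J[2, :2] = np.diag(V.conj().T @ Ri @ Ri @ V).real  # Eq. 3.2.17
    J[2, 2] = np.trace(Ri @ Ri).real                              # Eq. 3.2.18
    return np.sqrt(np.linalg.inv(J)[0, 0]) / alpha_0              # Eq. 3.2.21

def strong_jammer_approx(phi):
    c4 = np.cos(phi) ** 4
    return (2 - c4) / (1 - c4)                                    # Eq. 3.2.34

N = 10
for cos_phi in (-0.2, -0.5, -0.8):
    phi = np.arccos(cos_phi)
    print(cos_phi, crb_nrmse(phi, N, alpha_1=1e3), strong_jammer_approx(phi))
```

Under these assumptions the numerical bound should sit slightly above the approximation, and changing `spacing` (the frequency) should leave it unchanged because the target and its ambiguity share the same phase vector.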


3.3  Conclusions  and  Intuition 

This chapter isolates, measures, and bounds the VSA ability to discriminate signals
that are ambiguous with a PSA. Section 3.1 explores the problem of binary classification
when the signal power is known; Section 3.2 explores the problem of joint
estimation when the powers are unknown. Although the two scenarios are distinct,
they lead to similar conclusions and provide intuition about VSA performance.

The results prove that theoretical VSA ambiguity resolution is 1) good everywhere
except array endfire and 2) independent of the number of sensors or the analysis
frequency. Generally, bounds only prove that good performance may be achievable.
However, the bounds developed in this chapter are tight. The classification-based
bound in Section 3.1 is constructive, i.e. its derivation involves a likelihood ratio test
which achieves the minimum probability of error. The estimation-based bound in
Section 3.2 is asymptotically tight. Maximum likelihood estimators for this type of
problem are asymptotically efficient ([5, §8.5]), approaching the Cramer-Rao bound as
the number of observations approaches infinity ($K \to \infty$). Because both bounds are
tight, they prove that good performance is achievable under the scenarios described
in Sections 3.1 and 3.2.

Shared elements of the analysis also provide intuition about linear VSA capabilities.
Equations 3.1.29 and 3.2.34 reveal that both performance bounds are a function
of the same quantity, the left/right rejection of a single vector-sensor:

$$\zeta(\phi) = \cos^2\phi. \tag{3.3.1}$$

Ambiguity resolution stems from the directional ability of each vector-sensor, so the
behavior of one sensor provides intuition about the behavior of an array. Just as with
a single vector-sensor, the ability of a VSA to resolve pressure ambiguities

•  Does not change with the analysis frequency

•  Is good except near endfire (where $\zeta(\phi) \approx 1$)

•  Is robust (because left/right nulls are wide).

According to the same principle, the number of vector-sensors only affects left/right
performance through the array gain.

Chapter 4


Fixed  Weight  Design  for  Uniform 
Linear  Vector-Sensor  Arrays 


Chapters 2 and 3 explore the properties and performance limits of vector-sensor arrays.
Chapter 3 predicts that vector-sensor ambiguity resolution is very good almost
everywhere in theory. The question remains, however: is this performance achievable
in practice?


Building with the tools outlined in Chapter 2, this chapter designs fixed weights
for linear vector-sensor arrays. Designing “good” fixed weights for such arrays means
balancing the competing objectives of low sensitivity to modeling errors, a narrow
beamwidth, and low sidelobe levels. After surveying and categorizing existing techniques,
this chapter proposes the use of offline convex optimization for beampattern
design. The techniques described in Section 4.3.3 design robust, fixed weights for
efficient non-adaptive processing. In many scenarios, these weights achieve good performance
like that predicted in Chapter 3. Each modified beampattern in the chapter
is compared to the “original,” or conventional, beampattern.

4.1  Designs Using the Array Modulation Theorem (AMT)


Good linear PSA beampatterns are easily achieved for every “look direction” by applying
a well-designed spatial taper. Section 2.3.2 shows that this is not possible with
linear vector-sensor arrays, but a related technique illustrates the use of spatial tapers
in VSA processing. Imposing structure on the weights enables VSA beampattern design
using the array modulation theorem. Structuring the weights yields an intuitive
but less flexible technique.

4.1.1  The  Array  Modulation  Theorem  and  VSAs 

The array modulation theorem provides a useful visualization of spatial tapering applied
to vector-sensor arrays. It introduces “pattern multiplication,” which simplifies
certain designs by factoring the beampattern into two intuitive terms. This subsection
introduces the array modulation theorem and illustrates its application to linear
vector-sensor arrays.

The array modulation theorem states that the beampattern for an array of identical,
directional sensors is the beampattern assuming omnidirectional sensors modulated
by the response of the directional sensor. Because the sensor response is the
same for each element, it factors out of the beampattern summation. Proof of this
factorization is provided in [5, §2.8] for the general case and in [1, §2.1] for vector-sensor
arrays. A key restriction of the array modulation theorem is that the responses
of each sensor element must be identical.

Applying the array modulation theorem to vector-sensor arrays is not obvious
because the VSA contains four types of sensors with different responses. To apply
the theorem with “look direction” $\phi_0$, consider weighting the vector-sensor array by

$$w = t \otimes p, \tag{4.1.1}$$

where $p$ forms the same linear combination from every vector-sensor and $t$ applies
a unique scale factor to each. Figure 4.1.1 illustrates such a structured weighting
scheme. The 4 x 1 vector $p$ applies the same weighting to each vector-sensor, forming
$N$ identical “pseudo-sensors.” The $N \times 1$ spatial taper $t = [t_1 \;\; t_2 \;\; \cdots \;\; t_N]^T$ forms
a beampattern from these pseudo-sensors. Because the pseudo-sensors are identical,
the array modulation theorem implies that the VSA beampattern is the beampattern
of the taper modulated by the response of the pseudo-sensor.

Figure 4.1.1: Weighting scheme for which the array modulation theorem applies
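
Pattern multiplication follows from the mixed-product property of the Kronecker product, $(t \otimes p)^H (v_p \otimes h) = (t^H v_p)(p^H h)$. The sketch below checks the factorization numerically. It is illustrative only: the array geometry, the sensor response `h`, the element spacing of 5/14 wavelength (half-wavelength spacing at an assumed design frequency, operated at 5/7 of it), and the steered uniform taper are all assumptions made for the example.

```python
import numpy as np

def h(phi):
    # Assumed vector-sensor response (pressure plus three velocity components,
    # source in the horizontal plane).
    return np.array([1.0, np.cos(phi), np.sin(phi), 0.0])

def phase_vector(phi, N, d):
    # Assumed uniform line array on the x-axis; d is the spacing in wavelengths.
    return np.exp(2j * np.pi * d * np.arange(N) * np.cos(phi))

def replica(phi, N, d):
    return np.kron(phase_vector(phi, N, d), h(phi))

N, d, phi0 = 10, 5 / 14, -np.pi / 4

t = phase_vector(phi0, N, d) / N    # uniform taper steered to phi0
p = h(phi0) / 2                     # conventional pseudo-sensor weighting
w = np.kron(t, p)                   # structured weight, w = t (x) p

phi = np.linspace(-np.pi, np.pi, 361)
full = np.array([w.conj() @ replica(a, N, d) for a in phi])           # VSA beampattern
taper = np.array([t.conj() @ phase_vector(a, N, d) for a in phi])     # taper pattern
pseudo = np.array([p.conj() @ h(a) for a in phi])                     # pseudo-sensor response

print("max factorization error:", np.max(np.abs(full - taper * pseudo)))
```

The printed error should be at the level of floating-point round-off, which is the numerical statement that the VSA beampattern equals the taper beampattern modulated by the pseudo-sensor response.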

4.1.2  AMT  Beampattern  Design 

The array modulation theorem enables the design of interesting beampatterns which
combine spatial tapering with pseudo-sensor nulling. Some of these beampatterns
are explored in [3, 4]. The remainder of this section provides example beampatterns
which use the pseudo-sensor response to null the pressure-sensor ambiguity and the
spatial taper to control sidelobes. For comparison, all examples are with $\phi_0 = -\pi/4$,
$N = 10$ elements, and $f = 5/7\,f_d$.

A naive initial approach places the minimum of the pseudo-sensor response at zero
in the direction of the pressure-sensor ambiguity, effectively fixing the pseudo-sensor
beampattern and its derivative at a point. Assuming a three-dimensional vector-sensor
gives the pseudo-sensor weighting

$$p \propto \begin{bmatrix} 1 \\ -\cos\phi_0 \\ \sin\phi_0 \\ 0 \end{bmatrix}. \tag{4.1.2}$$

Figure 4.1.2: AMT beampattern: second-order constraints, uniform taper (beampatterns versus angle in radians)

With a uniform spatial taper, this weight yields modified beampatterns like the example
shown in Figure 4.1.2. The stringent, second-order constraints elevate the
pseudo-sensor response away from the null. This elevated pseudo-sensor response
raises the sidelobes of the VSA beampattern to significant levels and increases the
sensitivity factor of the weight by 6.02 dB, suggesting the weight is not robust. Replacing
the uniform spatial taper with a 25 dB Taylor taper improves the sidelobe
structure as shown in Figure 4.1.3 but increases the sensitivity factor to 6.45 dB.
Furthermore, the spatial taper does not reduce sidelobes to the desired level because
they are modulated higher by the pseudo-sensor response.


Figure 4.1.3: AMT beampattern: second-order constraints, 25 dB Taylor taper (beampatterns versus angle in radians)


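An alternative approach defines

$$h(\phi) = \begin{bmatrix} 1 \\ \cos\phi \\ \sin\phi \\ 0 \end{bmatrix} \tag{4.1.3}$$

and chooses the “optimum” pseudo-sensor weighting

$$p \propto \left( I - \frac{h(-\phi_0)\, h^H(-\phi_0)}{h^H(-\phi_0)\, h(-\phi_0)} \right) h(\phi_0) \tag{4.1.4}$$

$$= h(\phi_0) - h(-\phi_0) \cos^2\phi_0. \tag{4.1.5}$$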

This weighting minimizes the sensitivity factor subject to the point null constraint
at $-\phi_0$ and the unity gain constraint at $\phi_0$. The form of Equation 4.1.4 highlights
its interpretation as a projection of the unmodified weighting, $h(\phi_0)$, into the space
orthogonal to $h(-\phi_0)$. The optimum weighting produces beampatterns like the example
in Figure 4.1.4. The sensitivity factor of the example weight is elevated by
only 1.25 dB and the pressure-sensor ambiguity is reduced to a reasonably low level,
but the sidelobes are still higher than desired. Applying a 25 dB Taylor taper results
in the beampattern shown in Figure 4.1.5. Although the spatial taper lowers
the sidelobes to an acceptable level, the pressure-sensor ambiguity again becomes the
dominant feature. The addition of a spatial taper increases the sensitivity factor by
1.68 dB compared to the original, or conventional, weights.

Figure 4.1.4: AMT beampattern: “optimum” null, uniform taper
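
The two pseudo-sensor designs above can be compared with a few lines of NumPy. This is a sketch under stated assumptions, not the code behind the figures: the sensor response `h` (with a zero vertical component), the look direction $\phi_0 = -\pi/4$, and the helper names are chosen for illustration, and the sensitivity factor is taken as the squared norm of the unity-gain weight, reported relative to the conventional weighting.

```python
import numpy as np

def h(phi):
    # Assumed vector-sensor response with a zero vertical-velocity entry.
    return np.array([1.0, np.cos(phi), np.sin(phi), 0.0])

def unity_gain(p, phi0):
    # Scale the weighting so the pseudo-sensor response at phi0 is one.
    return p / (p @ h(phi0))

def sensitivity_db(p, phi0):
    """Squared norm of the unity-gain weight relative to conventional, in dB."""
    p = unity_gain(p, phi0)
    conv = unity_gain(h(phi0), phi0)
    return 10 * np.log10((p @ p) / (conv @ conv))

phi0 = -np.pi / 4
h0, hb = h(phi0), h(-phi0)              # look direction and its ambiguity

# Second-order null: response and its derivative vanish at -phi0.
p_second = np.array([1.0, -np.cos(phi0), np.sin(phi0), 0.0])

# "Optimum" null: project h(phi0) onto the complement of h(-phi0).
p_opt = (np.eye(4) - np.outer(hb, hb) / (hb @ hb)) @ h0
p_alt = h0 - hb * np.cos(phi0) ** 2     # closed form of the same projection

print("second-order null: %.2f dB" % sensitivity_db(p_second, phi0))
print("optimum null:      %.2f dB" % sensitivity_db(p_opt, phi0))
```

With a uniform taper the taper norm cancels in the ratio, so under these assumptions the pseudo-sensor values (about 6.02 dB and 1.25 dB) agree with the increases quoted in the text.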

Starting from the result in Figure 4.1.5, two final modifications are worth mentioning.
First, rather than constraining the null and minimizing the sensitivity factor,
one could constrain the sensitivity factor and minimize the power at $-\phi_0$. Applying
such a constraint makes the weights more robust at the expense of nulling the
pressure-sensor ambiguity. Because the pseudo-sensor weighting is a small vector, the
sensitivity constraint is computationally efficient. Second, one could offset the null
placement slightly to avoid an uneven splitting of the ambiguity. The beampattern
resulting from an “offset” null is illustrated in Figure 4.1.6. Offsetting the null moves
it closer to $\phi_0$ and thus raises the sensitivity factor slightly to 1.85 dB.

Figure 4.1.5: AMT beampattern: “optimum” null, 25 dB Taylor taper


4.1.3  Limitations  of  the  AMT  Approach 


Pattern multiplication is simple and intuitive, but it does not fully exploit the capabilities
of a vector-sensor array. Requiring that all vector-sensors form identical
pseudo-sensors is overly restrictive. Although some weights factor this way, many
useful weights do not.

The weights designed using pattern multiplication are restricted by the shape and
sensitivity of the pseudo-sensor response. Without losing generality, the vector-sensor
weight is parameterized as

$$p = \begin{bmatrix} \alpha \\ -\beta\cos\phi_m \\ -\beta\sin\phi_m \\ 0 \end{bmatrix} \tag{4.1.6}$$

with $\beta > 0$. With this parameterization, the shape of the pseudo-sensor response is
restricted to the linearly transformed cosine function

$$y_p(\phi) = p^H h(\phi) \tag{4.1.7}$$

$$= \alpha - \beta\cos(\phi - \phi_m), \tag{4.1.8}$$

which has a minimum at $y_p(\phi_m) = \alpha - \beta$ and a maximum at $y_p(\pi + \phi_m) = \alpha + \beta$. Thus,
the pseudo-sensor response has at most two nulls and only one minimum. Requiring
unity gain at $\phi_0$ leaves only two degrees-of-freedom for nulling the pressure ambiguity,
regardless of the number of sensors in the array. The sensitivity factor, $\alpha^2 + \beta^2$, is
constrained for robust weights, leaving very little freedom in the design.

Figure 4.1.6: AMT beampattern: “offset” null, 25 dB Taylor taper
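
The restricted shape of the pseudo-sensor response is easy to confirm numerically. In this short sketch the values of `alpha`, `beta`, and `phi_m` are arbitrary illustrative choices, and the sensor response `h` (with a zero vertical component) is an assumption.

```python
import numpy as np

def h(phi):
    # Assumed vector-sensor response with a zero vertical-velocity entry.
    return np.array([1.0, np.cos(phi), np.sin(phi), 0.0])

def pseudo_weight(alpha, beta, phi_m):
    # Parameterization of the pseudo-sensor weighting (Eq. 4.1.6).
    return np.array([alpha, -beta * np.cos(phi_m), -beta * np.sin(phi_m), 0.0])

alpha, beta, phi_m = 0.7, 0.4, np.pi / 3     # illustrative values only
p = pseudo_weight(alpha, beta, phi_m)

phi = np.linspace(-np.pi, np.pi, 721)
y = np.array([p @ h(a) for a in phi])        # pseudo-sensor response (Eq. 4.1.7)

print("min, max of response:", y.min(), y.max())   # alpha - beta, alpha + beta
print("squared norm of p:   ", p @ p)              # alpha**2 + beta**2
```

Whatever values are chosen, the response is a shifted and scaled cosine, so the only design freedom lies in `alpha`, `beta`, and `phi_m`.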

A final critique of pattern multiplication is that it does not directly address the
design objectives stated in this work: low sensitivity, a narrow beamwidth, and low
sidelobe levels. Although many of the ad hoc extensions to pattern multiplication
address these objectives, the resulting weights are not optimal with respect to any
specific design criterion.


4.2  Designs  Using  A  Priori  Noise  Distributions 


Another large class of algorithms for beampattern design is based on a priori knowledge
of the noise environment. This section mentions several such algorithms and
makes the assumed noise distribution explicit for each. A novel result for VSA spatially
spread sources is included. The section concludes with a critique of the technique. As
before, all examples in this section are with an $N = 10$ element vector-sensor array
at $f = 5/7\,f_d$, steered to $\phi_0 = -\pi/4$.

If the noise distribution is known, the “optimum” choice of beamforming weights
is the solution to the minimum variance distortionless response (MVDR) problem:

$$\begin{aligned}
\text{minimize} \quad & w^H R w \\
\text{subject to} \quad & w^H v_0 = 1
\end{aligned} \tag{4.2.1}$$

for the noise covariance matrix, $R$, and signal replica vector, $v_0$. The linear constraint
avoids distortion of signals in the replica direction; the quadratic objective minimizes
the leakage of interference into the beamformer output. For invertible $R$ the weight
vector has a closed form,

$$w = \frac{R^{-1} v_0}{v_0^H R^{-1} v_0}. \tag{4.2.2}$$

For singular $R$, it is convenient to define the MVDR weights in the limit

$$w = \lim_{\epsilon \to 0} \frac{(R + \epsilon^2 I)^{-1} v_0}{v_0^H (R + \epsilon^2 I)^{-1} v_0}. \tag{4.2.3}$$

The  above  limit  exists  for  any  covariance  matrix,  but  care  must  sometimes  be  taken 
to  ensure  numerical  stability.  Equation  4.2.3  has  many  natural  interpretations:  1)  it 
gives  the  minimum  norm  solution  to  the  under-determined  problem,  2)  the  resulting 
weight  vector  has  the  smallest  sensitivity  factor  of  any  solution,  and  3)  the  case  of  a 
degenerate  Gaussian  distribution  is  treated  properly  as  a  limit. 
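
The closed form and its limiting version can be sketched with a single linear solve. The example below is illustrative only: the function name `mvdr_weights`, the random replica vectors, and the small loading value `eps2 = 1e-6` used as a stand-in for the limit are all assumptions. It compares the loaded solution for a rank-one (singular) covariance with the minimum-norm weight that satisfies the same constraints.

```python
import numpy as np

def mvdr_weights(R, v0, eps2=0.0):
    """MVDR weights for covariance R + eps2 * I and replica v0 (Eqs. 4.2.2, 4.2.3)."""
    x = np.linalg.solve(R + eps2 * np.eye(len(v0)), v0)
    return x / (v0.conj() @ x)          # enforce the distortionless constraint

rng = np.random.default_rng(1)
M = 12
v0 = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # hypothetical signal replica
vb = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # hypothetical interferer replica

R_singular = np.outer(vb, vb.conj())    # rank one, so R itself is not invertible
w = mvdr_weights(R_singular, v0, eps2=1e-6)

# Minimum-norm weight with unit response to v0 and zero response to vb.
C = np.column_stack([v0, vb])
w_min_norm = C @ np.linalg.solve(C.conj().T @ C, np.array([1.0, 0.0]))

print("response to v0:", abs(w.conj() @ v0))
print("response to vb:", abs(w.conj() @ vb))
print("distance to minimum-norm weight:", np.linalg.norm(w - w_min_norm))
```

A very small `eps2` makes the loaded matrix poorly conditioned, which is the numerical-stability concern raised above; in practice the loading level is a trade between null depth and conditioning.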
4.2.1  Spatially Discrete Sources and Point Nulls


A common and practical approach to VSA beampattern design places a null on the
pressure-sensor ambiguity. The general formulation of this problem includes a sensitivity
constraint or diagonal loading. The equivalent a priori noise distribution is

$$R = v_b v_b^H + \epsilon^2 I, \tag{4.2.4}$$

where $v_b$ is the signal replica of the pressure-sensor ambiguity, or “backlobe.” The
diagonal loading factor, $\epsilon^2$, arises in many contexts, often as a regularization term or
as the Lagrange multiplier for a quadratic/sensitivity constraint.

Although this “point null” approach seems more formal than the ad hoc designs
using pattern multiplication, the two techniques are equivalent. The equivalence
is shown by applying the matrix inversion lemma to Equation 4.2.2 with the noise
distribution from Equation 4.2.4:

$$\begin{aligned}
w &\propto \left[ v_b v_b^H + \epsilon^2 I \right]^{-1} v_0 \\
&\propto \left[ I - \frac{v_b v_b^H}{\epsilon^2 + 2N} \right] v_0 \\
&= v_0 - v_b \left( \frac{2N}{\epsilon^2 + 2N} \cos^2\phi_0 \right).
\end{aligned} \tag{4.2.5}$$


Because the backlobe is a pressure-sensor/phase ambiguity, both the signal replica
and the backlobe replica have a common phase vector, $v_p$:

$$v_0 = v_p \otimes h(+\phi_0) \tag{4.2.6}$$

$$v_b = v_p \otimes h(-\phi_0). \tag{4.2.7}$$

For more on the Kronecker product representation of the vector-sensor array beampattern,
see [1, Ch. 2]. Substituting these expressions and using the properties of the
Kronecker product gives

$$\begin{aligned}
w &\propto [v_p \otimes h(+\phi_0)] - [v_p \otimes h(-\phi_0)] \left( \frac{2N}{\epsilon^2 + 2N} \cos^2\phi_0 \right) \\
&= v_p \otimes p,
\end{aligned} \tag{4.2.8}$$

where

$$p = h(+\phi_0) - h(-\phi_0) \left( \frac{2N}{\epsilon^2 + 2N} \cos^2\phi_0 \right). \tag{4.2.9}$$

Equation 4.2.8 reveals that the optimum weight resulting from this a priori distribution
naturally takes the form of pattern multiplication (Section 4.1.1). Furthermore,
the weight is equivalent to Equation 4.1.5 in the limit $\epsilon \to 0$.
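
The equivalence between the point null design and pattern multiplication can be checked numerically. The following sketch is illustrative and relies on assumptions not fixed by this section: a uniform line array on the x-axis with spacing 5/14 wavelength, a sensor response `h` with a zero vertical component, and an arbitrary loading value `eps2 = 0.1`.

```python
import numpy as np

def h(phi):
    # Assumed vector-sensor response with a zero vertical-velocity entry.
    return np.array([1.0, np.cos(phi), np.sin(phi), 0.0])

def phase_vector(phi, N, d):
    # Assumed uniform line array on the x-axis; d is the spacing in wavelengths.
    return np.exp(2j * np.pi * d * np.arange(N) * np.cos(phi))

N, d, phi0, eps2 = 10, 5 / 14, -np.pi / 4, 0.1

v_p = phase_vector(phi0, N, d)          # shared by the signal and the backlobe
v0 = np.kron(v_p, h(phi0))              # signal replica (Eq. 4.2.6)
vb = np.kron(v_p, h(-phi0))             # backlobe replica (Eq. 4.2.7)

# MVDR weight for the point-null noise model (Eqs. 4.2.2 and 4.2.4).
R = np.outer(vb, vb.conj()) + eps2 * np.eye(4 * N)
x = np.linalg.solve(R, v0)
w_mvdr = x / (v0.conj() @ x)

# Pattern-multiplication form with the pseudo-sensor weighting of Eq. 4.2.9.
p = h(phi0) - h(-phi0) * (2 * N / (eps2 + 2 * N)) * np.cos(phi0) ** 2
w_amt = np.kron(v_p, p)
w_amt = w_amt / (v0.conj() @ w_amt)     # same unity-gain normalization

print("max difference:", np.max(np.abs(w_mvdr - w_amt)))
```

Both weights are normalized to unity gain at the look direction, so under these assumptions they should agree to round-off; letting `eps2` shrink moves `p` toward the projection form of Equation 4.1.5.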

Designing VSA beampatterns using spatially discrete sources, or point nulls, extends
to multiple sources at arbitrary locations. Such approaches may yield acceptable
beampatterns, but no suitable method for determining the placement of sources/nulls
is known. With pressure-sensor arrays, polynomial and Fourier transform properties
analytically provide zero/null placement for Chebyshev, Taylor, Villeneuve, and other
beampatterns. With vector-sensor arrays, however, no such analysis exists.

4.2.2  Spatially  Spread  Sources  and  Sector  Nulls 

As  shown  in  Figure  4.1.6,  the  point  null  technique  (or  the  equivalent  array  modulation 
technique)  is  unable  to  reduce  the  entire  backlobe  to  a  sufficiently  low  level.  Choosing 
an  effective  set  of  constraint  points  is  difficult  in  the  vector-sensor  array  case,  so 
extending  the  technique  to  multiple  linear  constraints  is  problematic. 

An alternative approach uses the novel results in Section 2.4 to place a “sector
null” in the direction of the pressure ambiguity. The a priori noise covariance for this
approach is generally

$$R = R_b + \epsilon^2 I, \tag{4.2.10}$$

where $R_b$ is the covariance matrix of a spatially spread source located in the backlobe
direction. One of the three techniques in Section 2.4 is easily used to obtain $R_b$.
Because the covariance matrix of the spatially spread source is not full rank, some
amount of diagonal loading is required or the limiting form in Equation 4.2.3 must be
used.

Figure 4.2.1: Comparison of tapered point and sector null beampatterns

Figure 4.2.1 demonstrates that this approach produces acceptable beampatterns
when tuned properly. In this case, even a small amount of uniform spatial spreading
((T„  =  4  X  10“^,  about  2.8%  of  the  beamwidth)  and  diagonal  loading  (e^  =  2N  x  10“"^) 
substantially  improves  the  backlobe  null.  A  25  dB  Taylor  taper  is  also  applied  to 
reduce  the  sidelobe  levels.  The  point  and  sector  null  beampatterns  have  sensitivity 
factors  of  1.68  dB  and  1.71  dB,  respectively. 

Unfortunately, obtaining good beampatterns like the one shown in Figure 4.2.1 is
difficult in practice. Unlike the point null technique, no closed-form expression exists
for either the covariance matrix or the weight vector. Widening the backlobe null
relies heavily on the sub-dominant eigenvectors of $R_b$, requiring burdensome,
high-precision numerical integration for some terms (see Equation 2.4.7). Furthermore,
even a careful implementation of the sector null technique requires time-consuming
parameter tuning and necessitates the use of a spatial taper.

4.2.3 Limitations of the A Priori Approach

Section 4.2 presents a unified picture of many fixed weight design techniques for
vector-sensor arrays. Viewing these techniques in the context of “a priori noise
distributions” highlights the primary limitation common to all: no technique explicitly
addresses the key objectives of low sidelobes, a narrow mainlobe, and low sensitivity.
As such, each approach requires a patchwork of tools, including diagonal loading and
spatial tapers, to meet the objectives of this chapter. In every case, the parameter
tuning is ad hoc and not separable. Like VSA pattern multiplication, the a priori
approach yields weights that are suboptimal with respect to the stated design
objectives.


4.3  Designs  Using  Convex  Optimization 

Sections 4.1 and 4.2 summarize many typical techniques for VSA fixed weight
design. The techniques discussed are open to one powerful criticism: they do not directly
address the stated objectives of VSA beampattern design. Judging these techniques
by beamwidth, sidelobe level, and sensitivity is, in a sense, asking for one thing but
wanting another. The distinction between the implicit and explicit objectives is more
than philosophical.

This section shows that explicitly optimizing for the design objectives yields three
benefits. First, the resulting beampatterns are optimal in terms of one or more
objectives. The “optimality gap” of existing techniques is sometimes revealed to be
substantial. Second, trade-offs between the multiple objectives become
straightforward. Third, no ad hoc parameter tuning is required. The only parameters are the
intuitive constraints applied to each design objective.

The techniques presented in this section connect the fields of array processing and
convex optimization. Such a connection has been noted before in the optimization
literature, the filter design literature, and the array processing literature. Similar
techniques are applied to the related problem of FIR filter design in [23]. In [24],
the convex optimization material from [25] is applied to beampattern design. An
iterative algorithm for beampattern design is also presented in [5, §3.9.3]. Finally,
[26] illustrates the broad scope of applied convex optimization. Four contributions of
this section are applying optimization techniques to vector-sensor arrays, studying the
design problem in the appropriate multi-objective context, relating design criteria
to popular results in pressure-sensor array processing, and describing an efficient
algorithm for designing weights.

4.3.1  Spatial  Quantization 

Fixed weight design is often complicated by the uncountable number of points in the
beampattern. Analytical methods, such as polynomial approximation and sampling
of the Fourier transform, reduce the complexity of pressure-sensor array processing [5,
§3.2-3.3]. Chapter 2 reveals that such methods generally do not apply to vector-sensor
arrays. For vector-sensor arrays, spatial quantization converts the design problem into
a manageable form.

The smooth nature of the vector-sensor array beampattern means it is approximated
arbitrarily well by a finite number of sample points. Consider the important
problem of constraining a vector-sensor array beampattern in some region. Section
2.3 shows that the deviation of the VSA beampattern between two sample points is
bounded, so a finite sampling exists that constrains the beampattern to any arbitrary
accuracy. The local Fourier transform property suggests even more: because
the VSA beampattern behaves like a modified PSA beampattern on a small scale,
the same quantization scheme should work well with both array types. Existing work
on FIR filter design (or equivalently, PSA beampattern design) utilizes a uniform
grid in discrete-time frequency [27]. Relating this work to the vector-sensor array
problem suggests a dense grid of approximately $20N$ points in cosine-space
(approximately $10N$ per side) for a vector-sensor array beampattern. In some cases,
an equally dense grid in angular space yields better results near array endfire. The
tolerances in filter design are often tighter than in beampattern design, so this
sampling appears more than adequate for most beampattern design problems. For
extremely tight tolerances, exchange algorithms similar to [23] obtain an exact
solution by iteratively updating the quantization grid. The results in this thesis use
an extremely fine spatial quantization to ensure that any deviations are negligible.
The results also focus primarily on quantization in the azimuthal dimension; Section
4.3.5 demonstrates that the resulting beampatterns behave well in both dimensions
(azimuth and elevation).

Figure 4.3.1: Coarse spatial quantization in cosine-space

Figure 4.3.1 illustrates a coarse spatial quantization in cosine-space for a
vector-sensor beampattern steered to $\phi_0 = -\pi/4$. The array contains $N = 10$ vector-sensors
with $f = (5/7) f_d$. The lone equality constraint forces unity gain in the look direction,
i.e., a distortionless response. Upper and lower bounds on the beampattern, denoted
by triangles in the figure, are enforced at each angular sample point. The quantization
scheme partitions angular space into a mainlobe region and a sidelobe region.* In
the mainlobe region, the beampattern is constrained to be no greater than unity to
avoid “squinting” of the beampattern. In the sidelobe region, the beampattern is
constrained to be no greater than the desired sidelobe level. In both regions, the
beampattern is constrained to be no less than the negative of the desired sidelobe
level. The beampattern only becomes negative after the first null, so applying this
lower bound in the mainlobe region avoids bifurcation of the beam. Together, these
constraints approximate the shaded region in the figure. The coarse quantization in
Figure 4.3.1 uses only $5N$ points, far fewer than the suggested $20N$. Even with
such a coarse quantization, the beampattern deviates very little from the desired
(shaded) region. Note that the cosine spacing is most dense at broadside where
the beampattern changes rapidly with angle; it is least dense at endfire where the
beampattern changes slowly with angle.

*The mainlobe region in beampattern design relates to the passband and transition regions in
FIR filter design.
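
The quantization described above can be sketched in a few lines. This is a minimal illustration rather than the implementation used for the results: the function names, the cell-centered placement of the samples, and the `mainlobe_half_width` parameter are assumptions introduced here. The sketch builds a grid that is uniform in cosine-space on each side of the array and splits the sample indices into mainlobe and sidelobe sets.

```python
import math


def cosine_space_grid(n_sensors, points_per_sensor_per_side=10):
    """Azimuth angles (radians) whose cosines are uniformly spaced on each side.

    Assumption: samples sit at cell centers of u = cos(phi) in (-1, 1), so the
    two sides of the array never share an endfire point.
    """
    per_side = points_per_sensor_per_side * n_sensors
    u = [-1.0 + (2.0 * k + 1.0) / per_side for k in range(per_side)]
    right = [math.acos(x) for x in u]    # phi in (0, pi)
    left = [-math.acos(x) for x in u]    # phi in (-pi, 0)
    return sorted(left + right)


def partition(grid, look, mainlobe_half_width):
    """Split grid indices into mainlobe and sidelobe sets around the look angle."""
    mainlobe, sidelobe = [], []
    for m, phi in enumerate(grid):
        # Wrapped angular distance between the sample and the look direction.
        dist = abs(math.atan2(math.sin(phi - look), math.cos(phi - look)))
        (mainlobe if dist <= mainlobe_half_width else sidelobe).append(m)
    return mainlobe, sidelobe


grid = cosine_space_grid(10)                       # 20N = 200 sample angles
main_idx, side_idx = partition(grid, -math.pi / 4, 0.3)
```

With $N = 10$ the grid holds 200 angles. The angular gaps between neighboring samples are smallest near broadside and largest near endfire, matching the behavior noted above.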


Spatial quantization has a rich and formal mathematical background. Although
a full discussion is outside the scope of this document, the concepts are worth
mentioning. The fundamentals of spatial quantization are deeply rooted in differential
geometry. Many problems in signal and array processing involve inner products
between a fixed vector and the set of points on a manifold in some high-dimensional
vector space. Examples include filter design (the fixed vector is the impulse response
and the manifold is the Fourier basis) and general beampattern design (the vector
is an aperture weighting and the manifold is the array manifold of replica vectors).
Because the manifold is smooth, the inner product is smooth and is approximated to
arbitrary accuracy by sampling. In this context, 1) the vector-sensor array manifold
is a curve and 2) the cosine-spaced sampling selects points on the curve with
approximately equal geodesic spacing. The constraints also have a geometric picture. Each
linear inequality specifies a half-space; the collection specifies a high-dimensional
polygon, or a polytope. An equality constraint specifies a plane. Quadratic constraints
(introduced later) involve ellipsoidal regions. In this context, spatial quantization
is equivalent to approximating a desired region, or “feasible set,” with a polytope.
Further treatment of spatial quantization also exists in the field of semi-infinite
programming.

4.3.2 The Minimax Criterion


Because beampattern design is closely related to filter design, applying the same
design criteria to both problems seems logical. The powerful “minimax” criterion
is widely used in both filter design ([20, §7.4]) and PSA beampattern design ([5,
§3.6]). Although the analytical techniques used in filter design do not apply with
vector-sensor arrays, Section 2.3 lays the foundation for numerical minimax designs
using linear programming. This section motivates the minimax criterion in array
processing, applies it to VSA beampattern design, and discusses the results.

The minimax criterion arises in many contexts within array processing. Most
generally, the design criterion is to minimize the maximum error. Applying this concept
to beampattern design translates into minimizing the maximum sidelobe level. The
minimax criterion is a natural choice when dealing with discrete interference or
jamming, both common problems in array processing. The “worst-case” scenario for a
given beampattern places all interference at the exact location of the maximum
sidelobe, resulting in the lowest output signal-to-noise ratio. A minimax beampattern is
therefore the best worst-case design. When weights must be designed without a priori
information about interference location or power, the minimax criterion provides the
lowest upper bound on interference leakage. The minimax criterion also arises as a
common objective in approximation theory. In this context, the minimax
beampattern is an “optimal” approximation to the desired response. The maximum error is
formally defined as the $L_\infty$ norm, also known as the Chebyshev or uniform norm.

Designing a minimax VSA beampattern is equivalent to solving a real linear
program (LP). Proving this equivalence requires only three steps thanks to the results in
Sections 2.1 and 4.3.1. First, recall from Section 4.3.1 that constraining the
beampattern in the sidelobe region is equivalent to constraining it at a finite number of points.
A fixed spatial quantization can approximate the continuous constraint to an
arbitrary accuracy. By iteratively modifying the constraint points, exchange algorithms
find a quantization grid that yields the same result as the continuous constraint.
Second, recall from Equation 2.1.18 that the VSA beampattern at any point is a real,
linear function of the real and imaginary parts of the weight vector. If the weight
is represented as the real vector $\tilde{w}$ in the transformed space, the beampattern
at azimuth angle $\phi$ is expressed easily as $y(\phi) = \tilde{v}^T(\phi)\tilde{w}$. Note that $\tilde{v}(\phi)$ is
the transformed replica vector given by the real coefficients (see Section 2.1.2). The
third and final step is writing the minimax criterion as a standard linear program.
From the second step, an upper bound on the beampattern at angle $\phi$ is expressed as
$\tilde{v}^T(\phi)\tilde{w} \le \beta_U$. A lower bound is similarly expressed as $-\tilde{v}^T(\phi)\tilde{w} \le -\beta_L$. Utilizing
the spatial quantization illustrated in Figure 4.3.1 results in the minimax problem

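$$
\begin{array}{lll}
\text{minimize}   & \delta & \\
\text{subject to} & \tilde{v}^T(\phi_0)\,\tilde{w} = 1 & \\
                  & \tilde{v}^T(\phi_m)\,\tilde{w} \le 1, & m \in \mathcal{M} \\
                  & \tilde{v}^T(\phi_m)\,\tilde{w} \le \delta, & m \in \mathcal{S} \\
                  & -\tilde{v}^T(\phi_m)\,\tilde{w} \le \delta, & m \in \{\mathcal{M} \cup \mathcal{S}\}
\end{array}
\qquad (4.3.1)
$$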

The index sets $\mathcal{M}$ and $\mathcal{S}$ correspond to the mainlobe and sidelobe regions,
respectively. In order, the constraints in Equation 4.3.1 enforce distortionless response,
avoid mainlobe “squinting,” upper-bound the sidelobe region, and lower-bound the
beampattern. The minimization in Problem 4.3.1 is carried out over the variables $\delta$
and $\tilde{w}$. The objective and constraints in Problem 4.3.1 are linear functions of these
variables, so the problem constitutes a linear program. Because linear programs are
convex (see [25]), any solution found is a global optimum, although it need not be
unique. Refined numerical algorithms exist that solve linear programs quickly
(worst-case polynomial time) to a very high precision. Two of the most common
algorithms are interior point and active set (i.e., simplex) methods [26, 28].
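
To make the structure of Problem 4.3.1 concrete, the following sketch assembles the objective and constraint arrays for the stacked variable $x = [\tilde{w}; \delta]$. It is illustrative only: the function name and argument layout are assumptions introduced here, and the transformed replica vectors are taken as given lists of real numbers rather than computed from Section 2.1.2.

```python
def build_minimax_lp(v_look, v_grid, mainlobe, sidelobe):
    """Assemble Problem 4.3.1 as (c, A_ub, b_ub, A_eq, b_eq) for x = [w; delta].

    v_look   : real replica vector at the look direction (hypothetical input)
    v_grid   : list of real replica vectors, one per quantization point
    mainlobe : indices into v_grid for the mainlobe region
    sidelobe : indices into v_grid for the sidelobe region
    """
    n = len(v_look)
    c = [0.0] * n + [1.0]                      # objective: minimize delta

    a_eq = [list(v_look) + [0.0]]              # v0^T w = 1 (distortionless)
    b_eq = [1.0]

    a_ub, b_ub = [], []
    for m in mainlobe:                         # v_m^T w <= 1
        a_ub.append(list(v_grid[m]) + [0.0])
        b_ub.append(1.0)
    for m in sidelobe:                         # v_m^T w - delta <= 0
        a_ub.append(list(v_grid[m]) + [-1.0])
        b_ub.append(0.0)
    for m in list(mainlobe) + list(sidelobe):  # -v_m^T w - delta <= 0
        a_ub.append([-x for x in v_grid[m]] + [-1.0])
        b_ub.append(0.0)
    return c, a_ub, b_ub, a_eq, b_eq


# Toy data (not a real array manifold): one mainlobe point, two sidelobe points.
lp = build_minimax_lp([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [0], [1, 2])
```

Arrays in this form can be handed to a general-purpose LP solver. With `scipy.optimize.linprog`, for example, the variables default to non-negative, so free bounds on $\tilde{w}$ would have to be requested explicitly.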

Figure 4.3.2 provides an example VSA minimax beampattern. The beampatterns
shown use the same parameters as the other sections: an $N = 10$ element
vector-sensor array at frequency $f = (5/7) f_d$, steered to $\phi_0 = -\pi/4$. The mainlobe region
corresponds to the ~ 23 dB beamwidth of the conventional beampattern. Although
the sensitivity factor of the minimax weights is very high ($\xi \approx 154$ dB), the
difference between the original (conventional) and minimax beampatterns in Figure 4.3.2
is striking. The sidelobes of the minimax beampattern are approximately −31 dB, extremely low
compared to the original beampattern and at least 6 dB lower than any other
beampattern in this section. By definition, the maximum sidelobe level of the minimax
beampattern is the lowest possible. Furthermore, the minimax beampattern achieves
such a low sidelobe level with a narrower mainlobe than the conventional beampattern.
Minimax beampatterns appear significantly better than the alternatives when only
considering the design objectives of mainlobe width and sidelobe level.

Unfortunately, the minimax beampattern shown in Figure 4.3.2 is “too good to be
true” in practice. Recall the objectives of fixed weight design: narrow beamwidth, low
sidelobe levels, and low sensitivity. Although the minimax beampattern is optimal in
terms of the first two objectives, its sensitivity to modeling errors often makes it
impractical. Parameters subject to modeling errors are commonly treated as stochastic.
Following the work in [15, 1], this thesis uses the extended Gilbert-Morgan mismatch
model, treating the vector-sensor gains, phases, positions, and rotations as zero-mean
Gaussian random variables with standard deviations


    Parameter    Std. Dev.    Units
    Gain         0.1          Unitless
    Phase        10           Degrees (°)
    Position     0.1          Wavelengths (λ)
    Rotation     10           Degrees (°)

These parameters represent a scenario with a substantial amount of mismatch. The
beampattern with stochastic parameters becomes the expected response power as a
function of angle, or

$$
\begin{aligned}
B(\phi) &\triangleq E\{|y(\phi)|^2\} \\
        &= E\{|w^H v(\phi)|^2\} \\
        &= w^H R_{mm}(\phi)\, w, \qquad (4.3.2)
\end{aligned}
$$

where

$$R_{mm}(\phi) \triangleq E\{v(\phi)\, v^H(\phi)\} \qquad (4.3.3)$$

is the covariance matrix of a unit-power signal from angle $\phi$. Note that this quadratic
form of the beampattern is valid for any stochastic model. The minimax and
conventional beampatterns from Figure 4.3.2 are shown under this mismatch model in
Figure 4.3.3. Note the difference in scale from Figure 4.3.2. In this example, the
extreme sensitivity of the minimax beampattern renders it useless for practical array
processing. The large magnitude of each element in the minimax weight vector leads
to an intuitive understanding of this sensitivity. To obtain the low sidelobe levels and
narrow mainlobe shown in Figure 4.3.2, the minimax weights magnify minute
differences between the (ideal) responses of the sensors. Errors in the element responses
are typically much larger than these differences and are magnified by the minimax
weights, resulting in an unpredictable beampattern. One reason that the minimax
criterion is sometimes more effective in filter design than beampattern design is the
uncertainties involved in array processing. Errors in temporal sampling (i.e., clock
jitter) are typically much smaller than errors in spatial sampling (i.e., position errors).

Figure 4.3.3: Effects of mismatch on a minimax beampattern
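
The way large weights magnify mismatch can be seen in a deliberately simplified model. The sketch below is an assumption-laden illustration, not the extended Gilbert-Morgan model: it keeps only independent gain and phase errors on each channel and ignores position and rotation errors. Under that assumption each response is scaled by $(1+g)e^{j\theta}$ with $g \sim N(0, \sigma_g^2)$ and $\theta \sim N(0, \sigma_\theta^2)$, and the quadratic form $w^H R_{mm} w$ reduces to a scaled ideal response plus a floor proportional to $\sum_i |w_i|^2 |v_i|^2$. The function name and arguments are hypothetical.

```python
import math


def expected_power(weights, replica, gain_std, phase_std_rad):
    """Expected response power under independent per-channel gain/phase errors.

    Simplifying assumption: each replica entry is multiplied by
    (1 + g) * exp(j * theta), g ~ N(0, gain_std^2), theta ~ N(0, phase_std^2).
    """
    ideal = sum(w.conjugate() * v for w, v in zip(weights, replica))
    mu_sq = math.exp(-phase_std_rad ** 2)           # |E{exp(j*theta)}|^2
    floor = sum(abs(w) ** 2 * abs(v) ** 2 for w, v in zip(weights, replica))
    return mu_sq * abs(ideal) ** 2 + (1.0 + gain_std ** 2 - mu_sq) * floor


w = [0.5 + 0j, 0.5 + 0j]
look = [1 + 0j, 1 + 0j]      # direction where the ideal response is unity
null = [1 + 0j, -1 + 0j]     # direction where the ideal response is zero

p_look = expected_power(w, look, 0.0, 0.0)
p_null = expected_power(w, null, 0.1, math.radians(10))
p_null_big = expected_power([10 * x for x in w], null, 0.1, math.radians(10))
```

In this toy case the ideal null fills in to a nonzero level once errors are present, and scaling the weights by a factor of 10 raises that level by a factor of 100, which is the mechanism described above.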

Although the minimax beampattern is extremely sensitive, it serves two important
purposes. First, it provides a bound on the achievable mainlobe and sidelobe levels
for any fixed weight. The mainlobe in Figure 4.3.2 is only slightly narrower than the
conventional beampattern; the difference is far less than half the beamwidth. This
suggests that no VSA beampattern with acceptable sidelobe levels has a mainlobe
that is significantly narrower than that of the conventional beampattern. A theoretical
basis for this statement appears in the direction-of-arrival bound in [1] (see Figure
1.4.1). Second, the minimax beampattern reveals that vector-sensor array beampatterns
sometimes defy conventional wisdom and must be carefully designed. Assuming a typical
relationship between mainlobe and sidelobe levels, for instance, results in atypical behavior.

4.3.3 The Minimum Sensitivity Criterion


To avoid the extreme sensitivity demonstrated by the minimax beampatterns, an
alternative criterion must balance all three design objectives. This section proposes a
“minimum sensitivity” criterion which does just that, resulting in significant
improvements and allowing explicit trade-offs between the three objectives. The minimum
sensitivity criterion applied to vector-sensor array processing is a key contribution of
this work, but the same concepts appear elsewhere in filter design ([23]) and array
processing ([24]). The following paragraphs introduce the criterion, provide a few
examples, and discuss one important trade-off involved in the design.

The previous section illustrates the need to unify the design objectives of a
narrow mainlobe, low sidelobes, and low sensitivity into one design criterion. The key
question is how to relate the intuitive notion of a “good” beampattern in terms of
these competing objectives. The first step toward answering this question involves
“Pareto optimal” solutions. Although a full discussion of multi-objective optimization
is outside the scope of this document, Pareto optimal solutions capture the intuition
of multi-objective optimality. A solution is Pareto optimal if improving it along one
objective must necessarily worsen another. For any solution that is not Pareto
optimal, a Pareto optimal solution exists that is better along at least one objective and
no worse along the others. Therefore, the set of Pareto optimal solutions includes
every preferred solution to the multi-objective problem.
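
The definition of Pareto optimality translates directly into a dominance test. The sketch below is a generic illustration with made-up candidate designs; the tuple layout (beamwidth, sidelobe level in dB, sensitivity in dB, all to be minimized) and the numbers are assumptions chosen only to exercise the code.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def pareto_front(designs):
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical (beamwidth, sidelobe dB, sensitivity dB) triples.
candidates = [
    (10.0, -25.0, 1.4),
    (9.0, -31.0, 154.0),
    (10.0, -20.0, 1.5),   # dominated by the first candidate
    (12.0, -13.0, 0.0),
]
front = pareto_front(candidates)
```

Only the third candidate is discarded: the first is at least as good in every objective and strictly better in two. The remaining designs each win along some objective, so choosing among them is a matter of preference rather than optimality.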

One approach for exploring the set of Pareto optimal solutions is to minimize
one objective and constrain the others. For the three-objective beampattern design
problem, this yields only three candidate criteria. First, one could minimize the
sidelobe level and constrain the beamwidth and sensitivity. This candidate suffers from
two problems: 1) sensitivity is the most difficult of the three objectives to interpret
and thus constrain, and 2) the criterion allows for undesirable, large variations in
sidelobe level from beam to beam. A second candidate criterion is to minimize the
beamwidth and constrain the sidelobe level and sensitivity. As with the first
candidate, the sensitivity constraint is difficult to interpret. Furthermore, the resulting
minor improvements in beamwidth may not be worthwhile. The third and best
candidate criterion is to minimize the sensitivity and constrain the beamwidth and sidelobe
level. Choosing reasonable sidelobe and beamwidth constraints is straightforward, so
this criterion yields Pareto optimal beampatterns with little tuning.

“Minimum sensitivity” designs do just as the name suggests: they minimize the
sensitivity factor subject to a given mainlobe region and sidelobe level. Section 2.2
defines the (normalized) VSA sensitivity factor as

$$
\begin{aligned}
\xi &= 2N\, w^H w \qquad (4.3.4) \\
    &= \tilde{w}^T Q\, \tilde{w}. \qquad (4.3.5)
\end{aligned}
$$

The diagonal matrix $Q$ arises because of the transformation from the real weight
vector $\tilde{w}$ to the complex weight vector $w$. The form of $Q$ depends on the number of
elements (odd or even) but is clear from the transformation (see Section 2.1.2); an
even number of elements gives $Q = 4N \cdot I$. Because the sensitivity factor is a quadratic
form, minimum sensitivity weights are the solutions to a quadratic program. Section
4.3.2 demonstrates that the sidelobe constraints are linear. The beamwidth
constraint simply determines the width of the mainlobe region in the spatial quantization
scheme. The minimum sensitivity problem is therefore

$$
\begin{array}{lll}
\text{minimize}   & \tilde{w}^T Q\, \tilde{w} & \\
\text{subject to} & \tilde{v}^T(\phi_0)\,\tilde{w} = 1 & \\
                  & \tilde{v}^T(\phi_m)\,\tilde{w} \le 1, & m \in \mathcal{M} \\
                  & \tilde{v}^T(\phi_m)\,\tilde{w} \le \delta, & m \in \mathcal{S} \\
                  & -\tilde{v}^T(\phi_m)\,\tilde{w} \le \delta, & m \in \{\mathcal{M} \cup \mathcal{S}\}
\end{array}
\qquad (4.3.6)
$$

where $\mathcal{M}$ and $\mathcal{S}$ represent the mainlobe and sidelobe regions as in Problem 4.3.1.
Note that the optimization is performed only over $\tilde{w}$; the maximum sidelobe level,
$\delta$, is fixed. The matrix $Q$ is positive definite, so the quadratic program is strictly
convex with a unique global optimum. Refined algorithms exist that solve Problem
4.3.6 quickly (in worst-case polynomial time) and to high precision [26, 28]. Section
4.3.4 discusses one such algorithm that solves Problem 4.3.6 efficiently by leveraging
the special structure of the array processing problem.

Figure 4.3.4 provides an example of a minimum sensitivity beampattern. As
before, the beampattern is for an $N = 10$ element vector-sensor array steered to
$\phi_0 = -\pi/4$ at frequency $f = (5/7) f_d$. The maximum sidelobe constraint is −25 dB.
The mainlobe region is determined by the width of a corresponding pressure-sensor
array beampattern using a 25 dB Taylor taper. As expected, the sidelobes and backlobe
are low (no higher than −25 dB) and the mainlobe is comparable to the conventional
beampattern. By reducing the backlobe to a low sidelobe level, the minimum
sensitivity beampattern in Figure 4.3.4 resolves the pressure-sensor ambiguity as effectively
as it resolves signals from any other direction.

Figure 4.3.4 is an excellent result, but the true strength of the minimum
sensitivity criterion is illustrated in Figure 4.3.5. Under the same mismatch scenario
as Section 4.3.2, the minimum sensitivity weights from Figure 4.3.4 still effectively
resolve the pressure-sensor ambiguity. The effect of the mismatch is similar to
beampattern modulation with directional additive noise (see [1, 15]). As the sensitivity
factor increases, the noise level increases and the signal gain decreases. The
minimum sensitivity weights reduce the level of additive noise to nearly that of the
conventional beampattern and maintain the same level of gain. In short, the minimum
sensitivity beampattern is approximately as robust as the conventional beampattern but
without the backlobe. For comparison, the normalized sensitivity factor is $\xi \approx 154$ dB
for the minimax weights in Figure 4.3.3 and $\xi \approx 1.37$ dB for the minimum sensitivity
weights in Figure 4.3.5. The minimum value of $\xi$ is unity, or 0 dB.

Figure 4.3.5: Effects of mismatch on a minimum sensitivity beampattern
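
The sensitivity values quoted above follow from Equation 4.3.4. As a minimal sketch, the helper below evaluates $\xi = 2N\, w^H w$ and converts it to decibels; the function name and the toy weight vector are assumptions introduced here, not part of the original analysis.

```python
import math


def sensitivity_db(weights, n_sensors):
    """Normalized sensitivity factor xi = 2N * w^H w, expressed in dB."""
    xi = 2 * n_sensors * sum(abs(w) ** 2 for w in weights)
    return 10.0 * math.log10(xi)


# Toy weights with w^H w = 1 / (2N) for N = 10, so xi = 1.
toy_weights = [0.05 + 0j] * 20
xi_unit_db = sensitivity_db(toy_weights, 10)
xi_scaled_db = sensitivity_db([10 * w for w in toy_weights], 10)
```

Any weight vector with $w^H w = 1/(2N)$ sits at the 0 dB minimum, and scaling every weight by 10 raises the factor by 20 dB. On this scale the 154 dB minimax design corresponds to weight magnitudes many orders larger than the conventional ones.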

Minimum sensitivity beampatterns have a number of interesting properties that
merit further discussion. Figures 4.3.6 and 4.3.7 illustrate a second example
beampattern in ideal and mismatched scenarios. The beampatterns in these figures arise from
an $N = 20$ element vector-sensor array at frequency $f = (5/7) f_d$, steered near endfire at
$\phi_0 = -\pi/8$. The mismatch parameters are the same as before; the sidelobe constraint
remains −25 dB. Figure 4.3.6 reveals that the mainlobe and backlobe are merged when
the look direction is within a beamwidth of array endfire. Attempting to reduce the
backlobe to a low level is often infeasible or the resulting weights are extremely
sensitive. Chapter 3 provides a theoretical basis for this endfire performance problem.
The elegant solution, illustrated by the minimum sensitivity beampattern in Figure
4.3.6, extends the mainlobe region into the backlobe to widen the beamwidth until
the weights become less sensitive. For the example shown, the mainlobe was extended
until the sensitivity factor fell to $\xi = 3$ dB. Note that the resulting minimum
sensitivity beampattern improves the ambiguity resolution, maintains a narrow beamwidth,
and achieves low sidelobe levels. Another property demonstrated in Figures 4.3.6 and
4.3.7 is sidelobe roll-off. In this respect, the minimum sensitivity beampatterns are
very similar to the popular Taylor taper. As the sidelobe constraint becomes more
restrictive, more sidelobes meet it with equality. If the sidelobe constraint is very high
(not binding), the minimum sensitivity weight equals the conventional weight. With
the lowest feasible sidelobe level, the minimum sensitivity weight is the minimax weight.

One important issue in the design of minimum sensitivity weights is the trade-off
between sensitivity and maximum sidelobe level. As the maximum sidelobe level
decreases, the sensitivity increases monotonically. Figure 4.3.8 illustrates a typical
curve of minimum sensitivity versus sidelobe level for the familiar case with $N = 10$,
$f = (5/7) f_d$, and $\phi_0 = -\pi/4$. The minimum sensitivity weights delineate the gray
region in Figure 4.3.8. This region represents unachievable combinations of sidelobe
level and sensitivity. At one extreme, the minimax weights achieve the lowest
absolute sidelobe level. At the other extreme, the conventional weights achieve the lowest
absolute sensitivity. The minimum sensitivity curve connecting these points divides
into two regimes. The first regime in Figure 4.3.8 occurs for sidelobe levels above
≈ −25 dB, where a marginal decrease in sidelobe level changes the sensitivity very
little.* The weights in this regime are generally well-behaved: coarse spatial
quantization yields acceptable weights, solving Problem 4.3.6 is numerically stable and
efficient in practice, and the beampatterns are robust. The second regime in Figure
4.3.8 occurs for sidelobe levels below ≈ −25 dB, where a marginal decrease in
sidelobe level significantly increases the sensitivity. Decreasing the sidelobe level of the
ideal beampattern in this regime actually increases the expected sidelobe level under
a mismatch scenario. The weights in this second regime are sometimes numerically
unstable and difficult to compute in practice, and they require a fine spatial
quantization. Figure 4.3.8 reveals the importance of considering both sensitivity and
sidelobe level in the design process. Although it is not shown here, the curve of
minimum sensitivity versus beamwidth exhibits similar behavior.

*The “boundary” between the two regimes depends on the expected level of mismatch. The
choice of ≈ −25 dB is explained in Section 4.3.4.

Figure 4.3.8: Minimum sensitivity versus maximum sidelobe level

4.3.4  Advanced  Optimization  Topics 

The criteria discussed in Sections 4.3.2 and 4.3.3 provide a gentle introduction to
fixed weight design using convex optimization. Several related optimization topics
are worth mentioning here, but a detailed discussion is outside the scope of the
thesis. This subsection first provides heuristics and implementation details for the
VSA minimum sensitivity problem. It then highlights the role of convex optimization
in alternative design criteria.

Section  4.3.3  notes  that  the  minimum  sensitivity  criterion  requires  “reasonable” 
choices  for  sidelobe  level  and  beamwidth.  In  practice,  simple  heuristics  aid  in  choosing 
both  parameters.  The  approach  taken  here  sets  the  maximum  sidelobe  level  slightly 
below  the  average  expected  sidelobes  of  a  mismatched,  conventional  beampattern 
(see  [10],  [1,  §3.1],  and  [15]).  Intuitively,  this  choice  makes  the  sidelobes  negligible 
compared  to  the  noise  introduced  by  the  mismatch.  A  reasonable  mainlobe  region 
is  chosen  based  upon  a  standard  beamwidth,  then  expanded  as  necessary  until  the 
sensitivity  factor  falls  below  a  threshold. 

Many  algorithms  are  capable  of  solving  quadratic  programs  like  Problem  4.3.6, 
but the special structure of minimum sensitivity design favors a modified active-set
method  [28,  §16.6].  A  properly  implemented  active-set  algorithm  for  this  problem  is 
more  precise,  uses  less  memory,  and  requires  orders  of  magnitude  less  computation 
than  a  general  purpose  solver.  Active-set  methods  are  similar  in  structure  to  the 
Remez  exchange  algorithm:  they  solve  quadratic  programs  by  determining  the  small 
set  of  constraints  “active”  at  the  optimum  and  solving  a  simpler,  equality-constrained 
problem.  Each  iteration  of  the  active-set  method  requires  little  computation.  Only  a 
few  iterations  are  typically  required  to  solve  the  problem  from  a  good,  feasible  starting 
point.  Adding  an  (exact)  penalty,  commonly  referred  to  as  the  “big  M”  method, 
allows  starting  from  any  initial  guess.  Weights  computed  at  nearby  frequencies  or 
beams  often  provide  an  excellent  “warm-start”  to  the  algorithm. 

The  two  criteria  discussed  in  this  chapter  are  not  the  only  criteria  possible  with 
convex  optimization.  Alternative  criteria  which  seem  intractable  have  efficient  nu¬ 
merical  solutions  as  long  as  the  problems  are  convex  [25].  One  example  is  weight 
design  based  upon  the  expected  beampattern.  Consider  an  arbitrary  array  (including 
any  pressure-  or  vector-sensor  array)  under  any  deterministic  or  stochastic  model. 
Each point on the expected beampattern is a positive semidefinite quadratic form (like Equation 4.3.2). The minimax criterion applied to the expected beampattern yields a convex, second-order cone program (SOCP) that is efficient to solve [24, 26].

Figure 4.3.9: VSA beampatterns in azimuth and elevation


4.3.5  Beampatterns  in  Azimuth  and  Elevation 

The spatial quantization scheme discussed in Section 4.3.1 only constrains the beampattern of a horizontal vector-sensor array at zero elevation, i.e., in the horizontal
plane.  The  following  provides  an  example  and  brief  argument  that  additional  eleva¬ 
tion  constraints  are  unnecessary.  The  ability  of  a  horizontal  VSA  to  resolve  signals 
at  the  same  conical  angle  is  limited.  This  resolution  arises  only  from  the  direc¬ 
tional  elements;  there  is  no  vertical  aperture  to  a  horizontal  array.  Constraining  the 
beampattern  in  the  horizontal  plane  applies  two  constraints  per  conical  angle,  so  the 
constrained  beampattern  has  little  freedom  in  the  elevation  dimension.  Figure  4.3.9 
confirms  this  behavior  with  2-D  contours  of  the  beampatterns  from  Figure  4.3.4.  The 
look direction is marked with a star. The minimum sensitivity beampattern is lower than the conventional beampattern almost everywhere, even at nonzero elevation
where  there  are  no  constraints.  Constraining  the  beampattern  of  a  horizontal  VSA 
in  the  horizontal  plane  produces  a  weight  that  is  well-behaved  at  all  elevation  angles. 


4.4  Performance  of  Fixed  Weights 

The  stated  goal  of  this  chapter  is  the  design  of  fixed  weights  that  achieve  good  perfor¬ 
mance  like  that  predicted  in  Chapter  3.  Section  4.3.3  describes  the  design  of  minimum 
sensitivity  weights  achieving  1)  a  narrow  mainlobe,  2)  low  sidelobe  levels,  and  3)  low 
sensitivity.  Properties  (2)  and  (3)  ensure  that  these  weights  achieve  performance  close 
to  the  estimation  bound  in  Section  3.2  for  all  but  high-power  interference. 

For  any  fixed  weight,  the  NRMSE  performance  as  described  in  Chapter  3  depends 
only  on  the  sensitivity  factor  and  the  backlobe  rejection.  The  mean-squared  error  for 
the  estimation  scenario  illustrated  in  Figure  3.2.1  is 

$$\mathrm{MSE}_{\text{fixed}} = E\{(\alpha_0 - \hat{\alpha}_0)^2\} = \alpha_0^2 - 2\alpha_0 E\{\hat{\alpha}_0\} + E\{\hat{\alpha}_0^2\}. \tag{4.4.1}$$

The power estimate, $\hat{\alpha}_0$, obtained by any fixed weight is a scaled Chi-square random variable with two degrees of freedom, so $E\{\hat{\alpha}_0^2\} = 2E^2\{\hat{\alpha}_0\}$ and

$$\mathrm{MSE}_{\text{fixed}} = \alpha_0^2 - 2\alpha_0 E\{\hat{\alpha}_0\} + 2E^2\{\hat{\alpha}_0\}. \tag{4.4.2}$$

The beam output power for the white noise scenario in Section 3.2 is

$$E\{\hat{\alpha}_0\} = \alpha_0 + \sigma^2\,\mathbf{w}^H\mathbf{w} + \alpha_1\left|\mathbf{w}^H\mathbf{v}(-\phi_0)\right|^2. \tag{4.4.3}$$

The first term in Equation 4.4.3 is the target power, which remains undistorted by the fixed weight. The second term is the output power due to white noise and does not depend on the interference. The third term represents the backlobe interference “leakage.” When the array JNR is small compared to the backlobe rejection, the interference leakage is negligible and the NRMSE is nearly constant with JNR. For interference power above this threshold, the NRMSE increases quickly and diverges from the lower bound. The CRB is not a strict bound in this case because the estimator is biased, but it is still commonly used for comparison [5, Chapters 8 and 9].

Figure 4.4.1: Efficiency of an example fixed weighting (horizontal axis: array JNR in dB)
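
As a worked illustration of Equations 4.4.1 through 4.4.3, the following minimal sketch evaluates the NRMSE of a fixed weight from its sensitivity factor and backlobe response. It assumes that the NRMSE is the root of the MSE normalized by the target power and that the white-noise term equals the noise power times the sensitivity factor; the function and argument names are hypothetical and chosen only for this example.

```python
import math

def fixed_weight_nrmse(alpha0, alpha1, noise_power, sensitivity, backlobe_gain):
    """NRMSE of a fixed-weight power estimate (sketch of Eqs. 4.4.1-4.4.3).

    alpha0, alpha1 : target and interferer powers (linear units)
    noise_power    : white-noise power (assumed symbol sigma^2)
    sensitivity    : w^H w, the sensitivity factor of the weight
    backlobe_gain  : |w^H v(-phi0)|^2, the backlobe response of the weight
    """
    # Mean beam output power: target + white noise + backlobe leakage.
    mean_est = alpha0 + noise_power * sensitivity + alpha1 * backlobe_gain
    # Scaled Chi-square with two degrees of freedom: E{x^2} = 2 E{x}^2.
    mse = alpha0**2 - 2.0 * alpha0 * mean_est + 2.0 * mean_est**2
    return math.sqrt(mse) / alpha0

# Hypothetical numbers: weak and strong interference through the same backlobe.
weak = fixed_weight_nrmse(1.0, 1.0, 0.1, 0.1, 10 ** (-25 / 10))
strong = fixed_weight_nrmse(1.0, 1.0e4, 0.1, 0.1, 10 ** (-25 / 10))
```

With no noise and no interference the expression reduces to an NRMSE of one, which is the single-snapshot behavior of an exponentially distributed power estimate; leakage and noise only increase it.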

Because minimum sensitivity weights have low sidelobes and low sensitivity, their NRMSE performance remains close to the bound unless the JNR is high. Figure 4.4.1 illustrates the NRMSE performance of the minimum sensitivity weight from Figures 4.3.4 and 4.3.5. The performance of the fixed weight only diverges from the bound at JNR above the −25 dB sidelobe level. Fixed weights exist with better NRMSE performance under this specific scenario, but minimum sensitivity weights perform well for any sidelobe interferer without a priori knowledge of interference location.

Chapter 5


Subspace  Adaptive  Processing 


Chapter 4 describes the design of robust, fixed weights for vector-sensor array processing. The minimum sensitivity beampatterns in Section 4.3.3 achieve substantial improvements over existing techniques, but fixed-weight beamforming is fundamentally limited in two important areas. First, the beamformer resolution is restricted by the beamwidth. Second, the interference rejection is restricted by the sidelobe levels. Improving performance in either area motivates the use of data adaptive beamforming.

Adaptive  beamforming  (ABF)  improves  resolution  and  interference  rejection  by 
adjusting  the  array  weights  based  upon  observed  data.  A  thorough  introduction  to 
adaptive  processing  is  provided  in  [5].  Adaptive  processing  typically  proceeds  in  two 
steps.  First,  the  second-order  statistics  of  the  data  are  estimated  in  the  form  of  a 
covariance matrix. This chapter focuses on estimation using the “sample covariance matrix,”

$$\hat{\mathbf{R}} = \frac{1}{K}\sum_{k=1}^{K} \mathbf{x}_k\mathbf{x}_k^H, \tag{5.0.1}$$

because of its rapid convergence [9]. Recall from Section 3.1 that $K$ is the number of observations, or “snapshots.” The second step in adaptive processing computes weight vectors by substituting the sample covariance matrix for the true, or theoretical, covariance matrix. This step solves the MVDR problem in Equations 4.2.1 and 4.2.2 using the sample covariance matrix, $\hat{\mathbf{R}}$.

The key problem in adaptive vector-sensor array processing is large dimension. The data dimension, $D_0$, increases from $D_0 = N$ with a pressure-sensor array to $D_0 = 4N$ with a vector-sensor array. As mentioned in Section 1.4.3, adaptive processing requires $O(D^3)$ computation and $O(D)$ training time. Increasing $D$ by a factor of four necessitates algorithms that require less computation and converge more quickly. Processing power is easily increased, but training time is fundamentally limited by the stationarity of the data (see Section 1.4.3).
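
The two-step adaptive procedure described above can be sketched in a few lines. This is a minimal illustration, assuming the MVDR problem takes its standard form (minimize the output power subject to unit gain on the replica vector); the array sizes, the random data, and the placeholder replica vector are hypothetical.

```python
import numpy as np

def sample_covariance(snapshots):
    """Sample covariance matrix of Eq. 5.0.1; snapshots has shape (D, K)."""
    K = snapshots.shape[1]
    return snapshots @ snapshots.conj().T / K

def mvdr_weights(R, v):
    """Standard MVDR weight w = R^{-1} v / (v^H R^{-1} v), unit gain on v."""
    Rinv_v = np.linalg.solve(R, v)        # dense D x D solve, the O(D^3) cost
    return Rinv_v / (v.conj() @ Rinv_v)

rng = np.random.default_rng(0)
D, K = 8, 32                              # hypothetical sizes with K > D
x = (rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))) / np.sqrt(2)
v = np.ones(D, dtype=complex)             # placeholder replica vector
w = mvdr_weights(sample_covariance(x), v)
```

The linear solve is the step whose cost grows as the cube of the dimension, and the sample covariance is only invertible when the number of snapshots is at least the dimension, which is why both computation and training time suffer when the dimension grows by a factor of four.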

The  work  in  this  chapter  does  not  overcome  the  fundamental  problems  of  large  di¬ 
mension,  it  circumvents  them  by  reducing  the  problem  dimension  without  noticeably 
affecting  optimum  performance.  Sections  5.1  and  5.2  introduce  and  formalize  sub¬ 
space  processing.  Section  5.3  derives  an  “optimum”  subspace  appropriate  for  beam¬ 
forming  an  entire  region.  After  a  brief  discussion  of  the  mathematics  behind  subspace 
design,  the  chapter  concludes  by  analyzing  the  performance  of  adaptive  processing. 
Key  contributions  of  this  chapter  are  1)  a  theoretical  framework  for  subspace  design, 
2)  a  derivation  of  eigenbeam  transformations  within  this  framework,  and  3)  an  ap¬ 
proximation  revealing  the  substantial  dimension  reduction  achieved.  The  techniques 
in  this  chapter  apply  to  any  array,  but  the  analysis  and  results  are  VSA-specihc. 


5.1  Introduction  to  Subspace  Techniques 

Subspace techniques reduce both computation and training time by performing standard adaptive processing in a low dimensional subspace. In standard adaptive processing, the input data is fed directly into some adaptive processor. The dimension of this “element-space” adaptive problem equals the data dimension, or $D = D_0$. Subspace adaptive processing, illustrated by the block diagram in Figure 5.1.1, projects the data into a low dimensional subspace before applying adaptive processing. The replica vectors used in subspace adaptive processing pass through the same transformation. The dimension of the adaptive problem is reduced to the dimension of the subspace, $D \ll D_0$. The reduced-dimension adaptive problem requires less computation and, more importantly, less training time than the element-space scheme.

Figure 5.1.1: Adaptive processing in low-dimensional subspaces


Subspace  adaptive  processing  has  several  advantages  over  other  techniques  for 
dealing  with  large  dimension.  First,  it  involves  a  simple  linear  transformation  of 
the  data.  The  standard  processing  techniques  apply  without  modification  to  the 
transformed  data:  jointly  Gaussian  data  remains  jointly  Gaussian,  conventional  and 
adaptive  processing  retain  their  properties,  and  optimum  processing  takes  the  same 
form.  Second,  it  is  compatible  with  other  techniques.  Proven  techniques  such  as  di¬ 
agonal  loading  and  dominant  mode  rejection  remain  effective  [10,  11];  promising  new 
techniques  such  as  PGML  still  apply  to  the  transformed  data  [12].  Third,  subspace 
processing  is  computationally  efficient,  especially  for  static  arrays.  For  transforma¬ 
tions  computed  offline,  the  cost  of  applying  the  transformation  is  offset  by  the  savings 
of  inverting  a  smaller  matrix. 
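
The structure of subspace adaptive processing can be sketched as follows: the data and the replica vector pass through the same transformation, and a standard sample-matrix-inverse MVDR processor then runs in the reduced dimension. This is an illustration under assumed conditions, not a specific implementation; the random orthonormal matrix stands in for a designed transformation, and all sizes and names are hypothetical.

```python
import numpy as np

def subspace_mvdr(P, snapshots, v):
    """Adaptive (SMI) beamforming after projecting onto the columns of P.

    P         : (D0, D) matrix with orthonormal columns
    snapshots : (D0, K) element-space data
    v         : (D0,) element-space replica vector
    Returns the D-dimensional weight and the beam output power estimate.
    """
    y = P.conj().T @ snapshots            # transformed data, shape (D, K)
    v_sub = P.conj().T @ v                # replica passes through the same map
    R_sub = y @ y.conj().T / y.shape[1]   # D x D sample covariance
    Rinv_v = np.linalg.solve(R_sub, v_sub)
    denom = np.real(v_sub.conj() @ Rinv_v)
    return Rinv_v / denom, 1.0 / denom

rng = np.random.default_rng(1)
D0, D, K = 16, 4, 12                      # K < D0: element-space SMI would be singular
P, _ = np.linalg.qr(rng.standard_normal((D0, D)) + 1j * rng.standard_normal((D0, D)))
x = rng.standard_normal((D0, K)) + 1j * rng.standard_normal((D0, K))
v = P @ np.ones(D, dtype=complex)         # replica chosen inside span(P) for illustration
w, power = subspace_mvdr(P, x, v)
```

In this example the number of snapshots is smaller than the element-space dimension, so the element-space sample covariance could not be inverted, while the reduced problem remains well posed.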

Before discussing the fundamentals of subspace design, it is important to clarify the relationship between subspaces and orthonormal transformation matrices. The set of all $D$-dimensional subspaces in $\mathbb{C}^{D_0}$ is called the (complex) Grassmann manifold, $G(D, D_0)$. Each subspace can have infinitely many orthonormal bases, related by rotation, so there is generally not a one-to-one correspondence between $G(D, D_0)$ and the orthonormal matrices $\{\mathbf{P} \in \mathbb{C}^{D_0 \times D} \mid \mathbf{P}^H\mathbf{P} = \mathbf{I}\}$. The loose notation $\mathbf{P} \in G(D, D_0)$ indicates that $\mathbf{P}$ is an orthonormal basis for one subspace in $G(D, D_0)$. The optimal subspace is often unique, but the transformation matrix is not. For a full treatment of optimization on the Grassmann manifold, see [29].

5.2 Inner Product Preservation


The  primary  obstacle  to  developing  a  theory  of  subspace  processing  is  the  lack  of  a 
useful  optimality  criterion.  Many  proven  techniques  such  as  beamspace  processing 
[5,  §7.10]  and  subarray  processing  are  easily  framed  as  subspace  methods,  but  their 
choice  of  subspace  is  based  on  intuition.  Comparing  different  subspaces  requires  a 
definition of optimality, an ideal subspace. The goal of subspace processing is to
reduce  dimension  but  leave  the  optimum  output  unchanged.  Under  what  conditions 
is  the  subspace  transformation  “lossless”? 

Subspace  optimum  processing  is  equivalent  to  element-space  optimum  processing 
if  and  only  if  the  subspace  transformation  is  inner  product  preserving.  To  ensure 
equivalent  processing  for  the  signals  in  V,  an  orthonormal  transformation  matrix  P 
must  only  satisfy 


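$$\langle \mathbf{v}_0, \mathbf{v}_1 \rangle = \langle \mathbf{P}^H\mathbf{v}_0, \mathbf{P}^H\mathbf{v}_1 \rangle \quad \forall\, \mathbf{v}_0, \mathbf{v}_1 \in \mathcal{V}. \tag{5.2.1}$$

An informal proof of Equation 5.2.1 is straightforward: inner product preservation is equivalent to $\mathcal{V} \subset \mathrm{span}(\mathbf{P})$, so applying the transformation performs a change of basis and leaves the optimum output unaltered.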
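
A small numerical check makes the condition concrete. The sketch below, with hypothetical sizes and a random orthonormal matrix, evaluates the difference between the two sides of Equation 5.2.1 for signals inside and outside the span of the transformation.

```python
import numpy as np

def inner_product_error(P, v0, v1):
    """Return v0^H v1 - v0^H P P^H v1 for a matrix P with orthonormal columns."""
    return np.vdot(v0, v1) - np.vdot(P.conj().T @ v0, P.conj().T @ v1)

rng = np.random.default_rng(2)
D0, D = 12, 3
P, _ = np.linalg.qr(rng.standard_normal((D0, D)) + 1j * rng.standard_normal((D0, D)))

# Two signals inside span(P): the transformation is lossless for them.
v_in0, v_in1 = P @ rng.standard_normal(D), P @ rng.standard_normal(D)
err_inside = abs(inner_product_error(P, v_in0, v_in1))

# A generic signal outside span(P) loses norm, so the error is nonzero.
v_out = rng.standard_normal(D0) + 1j * rng.standard_normal(D0)
err_outside = abs(inner_product_error(P, v_out, v_out))
```

The first error vanishes to numerical precision, while the second equals the energy of the signal component that falls outside the subspace.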

Many  conditions  are  equivalent  to  Equation  5.2.1,  but  the  chosen  form  is  most 
useful  for  several  reasons.  First,  it  provides  a  quantitative  measure  of  subspace  per¬ 
formance  that  is  useful  for  comparison  and  design.  Second,  it  provides  intuition  about 
how  errors  affect  the  output.  If  the  norm  of  a  vector  is  not  preserved,  the  output 
signal-to-noise  ratio  is  lowered.  If  the  inner  product  between  two  vectors  is  not  fully 
preserved, the ability to distinguish between the two signals (e.g. nulling) is affected. Third, inner products naturally extend concepts from filter design to the multidimensional case. The one-dimensional subspace design problem is closely related to filter design; norms are equivalent to the magnitude response of the filter.

The power of Equation 5.2.1 lies in approximation. An ideal subspace satisfying Equation 5.2.1 with equality is often not useful in practice. The ideal subspace for many problems is simply element-space, or $\mathbf{P} = \mathbf{I}$. Although it sometimes takes the full dimension to satisfy with equality, inner product preservation is often well-approximated with low dimensional subspaces.


5.3  Regional  Projections 

A natural starting point for subspace design appears in the definition of inner product preservation. How can Equation 5.2.1 be approximated to support low-dimensional processing over an entire region? This section provides one answer in the form of “eigenbeam” transformations. Eigenbeam transformations are not new to array processing (see [30]), but the justification given for their use is often vague. This section derives eigenbeam transformations as the subspaces minimizing the squared error of the inner products.

5.3.1  Problem  Description 

Section  5.2  derives  conditions  corresponding  to  the  “ideal”  subspace.  The  design 
problem  is  now  reduced  to  approximation,  or  selecting  a  subspace  that  minimizes 
some  measure  of  deviation  from  the  ideal.  There  are  many  useful  error  metrics, 
each of which defines an “optimal” subspace. This section describes the problem of
minimizing  a  common  and  tractable  metric,  the  total  squared  error. 

Applying the minimum squared error criterion to subspace design involves formally defining the error and stating the problem. Section 5.2 reveals that the error between the ideal subspace and any orthonormal transformation, $\mathbf{P}$, is captured by the error in the inner products. For any two vectors $\{\mathbf{v}_0, \mathbf{v}_1\}$, the error in the inner product is

$$e\{\mathbf{v}_0, \mathbf{v}_1\} = \mathbf{v}_0^H\mathbf{v}_1 - \mathbf{v}_0^H\mathbf{P}\mathbf{P}^H\mathbf{v}_1. \tag{5.3.1}$$

The error implicitly depends on the transformation $\mathbf{P}$. Equation 5.3.1 must be considered over all pairs of vectors in some region, $\mathcal{K}$, of the manifold. The $\mathcal{K}$ considered in this chapter are regions in cosine-space, or $u$-space, but the derivation is valid for regions defined in any parameterization of the manifold (angular-space, wavenumber-space, etc.). The error is minimized over the set of manifold vectors in the region,

$$\mathcal{V} = \{\mathbf{v}(\mathbf{k}) \mid \mathbf{k} \in \mathcal{K}\}. \tag{5.3.2}$$

Writing the minimization in terms of the region, $\mathcal{K}$, gives the optimization problem

$$\min_{\mathbf{P} \in G(D, D_0)} \int_{\mathcal{K}}\int_{\mathcal{K}} \left|e\{\mathbf{v}(\mathbf{k}_0), \mathbf{v}(\mathbf{k}_1)\}\right|^2 d\mathbf{k}_0\, d\mathbf{k}_1. \tag{5.3.3}$$

Using the loose notation $\mathbf{v}_0 = \mathbf{v}(\mathbf{k}_0)$ and $\mathbf{v}_1 = \mathbf{v}(\mathbf{k}_1)$ makes the dependence on $\mathbf{k}$ implicit and produces a compact form of the problem:

$$\min_{\mathbf{P} \in G(D, D_0)} \int_{\mathcal{K}}\int_{\mathcal{K}} \left|e\{\mathbf{v}_0, \mathbf{v}_1\}\right|^2 d\mathbf{k}_0\, d\mathbf{k}_1. \tag{5.3.4}$$

The double integral over $\mathcal{K}$ in Equation 5.3.4 captures every pair of inner products.

5.3.2  Solution:  Eigenbeams 

The optimization problem in Equation 5.3.4 seems very difficult at first glance. It
is  non-convex,  requires  a  search  over  complex  manifolds,  and  involves  difficult  inte¬ 
grals.  The  solution,  however,  is  powerful  and  elegant.  The  global  optimum  is  easily 
computed  by  a  singular  value  decomposition  to  high  numerical  precision. 

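The global optimum to Problem 5.3.4 is derived in three steps. First, the problem is modified to search over the orthogonal complement of the desired subspace. The inner product error in Equation 5.3.1 is easily written as

$$e\{\mathbf{v}_0, \mathbf{v}_1\} = \mathbf{v}_0^H\mathbf{P}_\perp\mathbf{P}_\perp^H\mathbf{v}_1, \tag{5.3.5}$$

where $\mathbf{P}_\perp \in G(D_0 - D, D_0)$ is an orthonormal basis for the orthogonal complement of $\mathrm{span}(\mathbf{P})$, i.e. the null-space of $\mathbf{P}^H$. Finding $\mathbf{P}$ in Problem 5.3.4 is equivalent to finding $\mathbf{P}_\perp$ in

$$\min_{\mathbf{P}_\perp \in G(D_0 - D, D_0)} \int_{\mathcal{K}}\int_{\mathcal{K}} \left|\mathbf{v}_0^H\mathbf{P}_\perp\mathbf{P}_\perp^H\mathbf{v}_1\right|^2 d\mathbf{k}_0\, d\mathbf{k}_1. \tag{5.3.6}$$

Second, the integrals are removed from the problem. Expanding the objective in Equation 5.3.6 gives

$$\begin{aligned}
\int_{\mathcal{K}}\int_{\mathcal{K}} \left|\mathbf{v}_0^H\mathbf{P}_\perp\mathbf{P}_\perp^H\mathbf{v}_1\right|^2 d\mathbf{k}_0\, d\mathbf{k}_1
&= \int_{\mathcal{K}}\int_{\mathcal{K}} \mathbf{v}_0^H\mathbf{P}_\perp\mathbf{P}_\perp^H\mathbf{v}_1\mathbf{v}_1^H\mathbf{P}_\perp\mathbf{P}_\perp^H\mathbf{v}_0 \, d\mathbf{k}_0\, d\mathbf{k}_1 \\
&= \int_{\mathcal{K}}\int_{\mathcal{K}} \mathrm{tr}\left(\mathbf{P}_\perp^H\mathbf{v}_0\mathbf{v}_0^H\mathbf{P}_\perp\mathbf{P}_\perp^H\mathbf{v}_1\mathbf{v}_1^H\mathbf{P}_\perp\right) d\mathbf{k}_0\, d\mathbf{k}_1 \\
&= \int_{\mathcal{K}} \mathrm{tr}\left(\mathbf{P}_\perp^H\left[\int_{\mathcal{K}}\mathbf{v}_0\mathbf{v}_0^H d\mathbf{k}_0\right]\mathbf{P}_\perp\mathbf{P}_\perp^H\mathbf{v}_1\mathbf{v}_1^H\mathbf{P}_\perp\right) d\mathbf{k}_1 \\
&= \mathrm{tr}\left(\mathbf{P}_\perp^H\left[\int_{\mathcal{K}}\mathbf{v}_0\mathbf{v}_0^H d\mathbf{k}_0\right]\mathbf{P}_\perp\mathbf{P}_\perp^H\left[\int_{\mathcal{K}}\mathbf{v}_1\mathbf{v}_1^H d\mathbf{k}_1\right]\mathbf{P}_\perp\right) \\
&= \mathrm{tr}\left[\left(\mathbf{P}_\perp^H\mathbf{R}_{\mathcal{K}}\mathbf{P}_\perp\right)^2\right],
\end{aligned} \tag{5.3.7}$$

where $\mathbf{R}_{\mathcal{K}}$ is the covariance matrix

$$\mathbf{R}_{\mathcal{K}} = \int_{\mathcal{K}} \mathbf{v}(\mathbf{k})\mathbf{v}^H(\mathbf{k})\, d\mathbf{k}. \tag{5.3.8}$$

The first step above treats the scalar integrand as the trace of a 1 × 1 matrix and uses the trace identity tr(AB) = tr(BA); the remaining steps utilize the linearity of the trace function. Note that $\mathbf{R}_{\mathcal{K}}$ is the covariance matrix of isotropic noise over $\mathcal{K}$.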


The third and final step to solving Problem 5.3.4 determines the global optimum using the Poincaré separation theorem [31, §4.3]. Let $\lambda_i(\mathbf{R})$ represent the $i$-th largest eigenvalue of the $n \times n$ Hermitian matrix $\mathbf{R}$. For any $n \times r$ orthonormal transformation matrix $\mathbf{P}$, the Poincaré separation theorem states that

$$\lambda_i(\mathbf{R}) \ge \lambda_i(\mathbf{P}^H\mathbf{R}\mathbf{P}) \ge \lambda_{n-r+i}(\mathbf{R}), \quad i = 1, \ldots, r. \tag{5.3.9}$$

The upper and lower bounds in Equation 5.3.9 are achieved simultaneously for all eigenvalues when $\mathbf{P}$ spans the dominant and sub-dominant subspace of $\mathbf{R}$, respectively.^ One implication of Equation 5.3.9 is that

$$\sum_{i=1}^{r} f\left[\lambda_i(\mathbf{P}^H\mathbf{R}\mathbf{P})\right] \tag{5.3.10}$$

is maximized/minimized for any monotonically increasing function $f(\lambda)$ when $\mathbf{P}$ spans the dominant/sub-dominant subspace of $\mathbf{R}$. Substituting Equation 5.3.7 into Problem 5.3.6 yields

$$\min_{\mathbf{P}_\perp \in G(D_0 - D, D_0)} \sum_{i=1}^{D_0 - D} \lambda_i^2\left(\mathbf{P}_\perp^H\mathbf{R}_{\mathcal{K}}\mathbf{P}_\perp\right). \tag{5.3.11}$$

Because $\mathbf{R}_{\mathcal{K}}$ is positive semidefinite, each eigenvalue is nonnegative and $f(\lambda) = \lambda^2$ is monotonically increasing over the range of interest. Using Equation 5.3.10, the global optimum is achieved when $\mathbf{P}_\perp$ spans the sub-dominant subspace of $\mathbf{R}_{\mathcal{K}}$, or when $\mathbf{P}$ spans the dominant subspace. The global optimum subspace is always unique when the eigenvalues of $\mathbf{R}_{\mathcal{K}}$ are distinct.

The above derivation is complex, but obtaining the optimum transformation is relatively simple. An orthonormal transformation to the optimum $D$-dimensional subspace is found in three steps: 1) integrate over the region to form $\mathbf{R}_{\mathcal{K}}$ (Equation 5.3.8), 2) compute the singular value decomposition of $\mathbf{R}_{\mathcal{K}}$, and 3) form the transformation matrix, $\mathbf{P}_e$, from the $D$ dominant eigenvectors.

^The eigenvectors with the largest eigenvalues span the dominant subspace; the eigenvectors with the smallest eigenvalues span the sub-dominant subspace.

Note that the first two steps do not depend on the subspace dimension. Applying the transformation matrix is “beamforming” with the eigenvectors of $\mathbf{R}_{\mathcal{K}}$, so $\mathbf{P}_e$ is commonly referred to as an “eigenbeam” transformation. Often, the most difficult step in computing the eigenbeam transformation is integrating to form $\mathbf{R}_{\mathcal{K}}$.
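
The procedure of integrating, decomposing, and truncating can be sketched numerically. The example below is only an illustration under stated assumptions: it uses a pressure-sensor line array with half-wavelength spacing at the design frequency as the manifold (the vector-sensor manifold is not reproduced here), approximates the integral by a Riemann sum on a uniform grid in $u$-space, and uses a Hermitian eigendecomposition, which is equivalent to the singular value decomposition for a positive semidefinite matrix. All function names and sizes are hypothetical.

```python
import numpy as np

def ula_replica(u, n_sensors, f_ratio=1.0):
    """Pressure-sensor line-array replica at cosine u (assumed manifold).

    Half-wavelength spacing at the design frequency; f_ratio = f / f_d.
    """
    n = np.arange(n_sensors)
    return np.exp(1j * np.pi * f_ratio * n * u)

def eigenbeam_transform(replica, u_grid, dim):
    """Eigenbeam basis for the region sampled by u_grid.

    Step 1: approximate R_K by a Riemann sum over a uniform grid.
    Step 2: eigendecompose the Hermitian matrix R_K.
    Step 3: keep the `dim` eigenvectors with the largest eigenvalues.
    """
    V = np.stack([replica(u) for u in u_grid], axis=1)   # (D0, n_points)
    du = u_grid[1] - u_grid[0]
    R = (V @ V.conj().T) * du
    eigvals, eigvecs = np.linalg.eigh(R)                 # ascending order
    order = np.argsort(eigvals)[::-1]                    # reorder to descending
    return eigvecs[:, order[:dim]], eigvals[order]

N, f_ratio = 16, 0.5
u_grid = np.linspace(-1.0, 1.0, 2001)
P_e, lam = eigenbeam_transform(lambda u: ula_replica(u, N, f_ratio), u_grid, dim=10)
```

Because the first two steps do not depend on the subspace dimension, the sorted eigenvalues returned here can be reused to compare several candidate dimensions without recomputing the decomposition.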


5.3.3  Analysis 

The  procedure  in  Section  5.3.2  constructs  orthonormal  transformations  to  the  opti¬ 
mal  least-squares  subspaces  of  any  chosen  dimension.  Section  5.3.2  only  proves  that 
these  subspaces  minimize  the  least-squares  criterion;  it  does  not  indicate  how  much 
dimension  reduction  is  possible. 

This subsection analyzes the behavior of eigenbeam subspaces and demonstrates a significant reduction in dimension with near-optimal performance. First, it characterizes the error behavior of eigenbeam transformations and provides a rule for choosing an acceptable subspace dimension. Second, it shows that the required dimension is often very small and grows linearly with region size and frequency. Third, it provides an example of the performance improvements achieved with subspace adaptive processing.

The integrated squared error from Equation 5.3.4 provides a natural metric for choosing the dimension of eigenbeam subspaces. Just as the length of a filter is chosen to satisfy a maximum error criterion, the dimension of an eigenbeam subspace is easily chosen by the integrated error. Because the eigenbeams span the dominant subspace of $\mathbf{R}_{\mathcal{K}}$, the integrated error of a $D$-dimensional eigenbeam subspace is determined by the sub-dominant eigenvalues of $\mathbf{R}_{\mathcal{K}}$:

$$\epsilon(D) = \sum_{i=D+1}^{D_0} \lambda_i^2(\mathbf{R}_{\mathcal{K}}). \tag{5.3.12}$$

Recall that $D_0$ is the dimension of the full space, or the data dimension. The function $\epsilon(D)$ is easily evaluated from the existing singular value decomposition. A more universal quantity is the relative, or fractional, error formed by the ratio

$$\eta(D) \triangleq \frac{\epsilon(D)}{\epsilon(0)} = \frac{\sum_{i=D+1}^{D_0} \lambda_i^2(\mathbf{R}_{\mathcal{K}})}{\sum_{i=1}^{D_0} \lambda_i^2(\mathbf{R}_{\mathcal{K}})}. \tag{5.3.13}$$

Figure 5.3.1: Integrated error versus eigenbeam dimension (horizontal axis: normalized dimension $D/D_0$)

Figure 5.3.2: Loss incurred with an eigenbeam subspace: $D/D_0 = 0.4$

By definition, the fractional error lies in the range $0 \le \eta(D) \le 1$ and is monotonically decreasing with $D$. Figure 5.3.1 provides an example of fractional error (in decibels) versus subspace dimension. The curve in Figure 5.3.1 corresponds to an $N = 30$ element VSA at $f = (5/7) f_d$; the region $\mathcal{K}$ corresponds to the entire visible region in $u$-space. Figure 5.3.1 reveals that the fractional error exhibits threshold behavior at a critical dimension that is much less than $D_0$. Beyond this critical dimension, $\eta(D)$ decreases rapidly. A rule-of-thumb for choosing the subspace dimension is to select the minimum dimension that reduces the fractional error below a given threshold, or

$$D_\epsilon = \inf\{D \mid \eta(D) < \epsilon\} \tag{5.3.14}$$

for some $\epsilon \ll 1$. The threshold behavior of $\eta(D)$ implies that the particular choice of $\epsilon$ has little effect on the resulting dimension, $D_\epsilon$. The remainder of this section
assumes  a  conservative  threshold  of  e  =  10“®.  Applying  this  threshold  to  the  example 
in Figure 5.3.1 gives $D/D_0 = 0.4$, a 60% reduction in dimension. To visualize the resulting errors, consider the “subspace loss” of a manifold vector $\mathbf{v}$ as the ratio

$$\frac{\mathbf{v}^H\mathbf{P}\mathbf{P}^H\mathbf{v}}{\mathbf{v}^H\mathbf{v}} \le 1. \tag{5.3.15}$$

As its name implies, the subspace loss never exceeds unity when $\mathbf{P}$ is orthonormal. Figure 5.3.2 confirms that the subspace loss is negligible (< 0.01 dB) for this example.
The figure also reveals that subspace loss increases near array endfire.

Eigenbeam subspaces yield substantial dimension reduction when applied to vector-sensor arrays. Below the design frequency, the required dimension approximately equals the number of critical beams required to cover the region. Critical beams are conventional beams spaced at the Rayleigh resolution limit, i.e. at the peak-to-null width. In $u$-space, the VSA peak-to-null width is

$$(\Delta u)_{pn} = \frac{2}{N}\,\frac{f_d}{f}, \tag{5.3.16}$$

so the number of critical beams required to cover some region $\Delta u$ on both sides of the array is

$$B_{\text{crit}} = \frac{2\,\Delta u}{(\Delta u)_{pn}}. \tag{5.3.17}$$

The additional factor of two accounts for the two sides of the array. Figure 5.3.3 illustrates the eigenbeam dimension versus frequency and region size for a very long array ($N = 201$). The subspace dimension is well approximated by the number of critical beams, or

$$\frac{D_\epsilon}{D_0} \approx \frac{B_{\text{crit}}}{D_0} = \frac{1}{2}\left(\frac{\Delta u}{2}\right)\left(\frac{f}{f_d}\right). \tag{5.3.18}$$

Equation 5.3.18 is written in terms of normalized quantities to illustrate its simple bilinear form. The subspace dimension is often slightly greater than the approximation, depending on the threshold, but the approximation becomes tighter as the number of elements increases. Equation 5.3.18 is not intended to replace the numerical method in Equation 5.3.14, only to illustrate its dependencies. The power of subspace processing is evident in Equation 5.3.18 and Figure 5.3.3: compared to element-space, eigenbeam processing reduces the dimension by at least half and often much more.

Figure 5.3.3: Eigenbeam dimension versus frequency and region size
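
The numerical rule of Equation 5.3.14 and the critical-beam approximation can be placed side by side in a short sketch. The code below assumes the eigenvalues of the regional covariance matrix are already available, and it assumes a peak-to-null width of $(2/N)(f_d/f)$ in $u$-space when counting critical beams; the function names, the toy eigenvalues, and the threshold value in the example are hypothetical.

```python
import numpy as np

def fractional_error(eigvals):
    """eta(D) for D = 0..D0 from the eigenvalues of R_K (Eq. 5.3.13)."""
    lam2 = np.sort(np.asarray(eigvals, dtype=float))[::-1] ** 2
    # tail[D] is the sum of the squared eigenvalues with index greater than D.
    tail = np.concatenate([np.cumsum(lam2[::-1])[::-1], [0.0]])
    return tail / tail[0]

def choose_dimension(eigvals, eps):
    """Smallest D with eta(D) < eps (Eq. 5.3.14); requires eps > 0."""
    return int(np.argmax(fractional_error(eigvals) < eps))

def critical_beams(n_sensors, delta_u, f_ratio):
    """Critical beams covering a width delta_u on both sides of the array.

    Assumes a peak-to-null width of (2 / N) * (f_d / f); f_ratio = f / f_d.
    """
    return n_sensors * delta_u * f_ratio

# Toy eigenvalues with a sharp threshold, and the N = 30, f = 5/7 f_d example.
d_eps = choose_dimension([4.0, 2.0, 1.0, 1e-4], eps=1e-6)
b_crit = critical_beams(30, 2.0, 5.0 / 7.0)
```

For the 30-element example over the full visible region, the approximation gives roughly 43 critical beams, somewhat below the dimension of 48 quoted for the numerical rule, which matches the observation that the required dimension is often slightly greater than the approximation.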


The dimension reduction achieved with eigenbeam processing allows for dramatic improvements when the training is limited. As mentioned in Section 1.4.3, the number of data snapshots is often too few for element-space processing. Figure 5.3.4 illustrates one such scenario with an $N = 30$ element vector-sensor array at $f = (5/7) f_d$. The simulation involves white noise and four sources at $\phi = \{-0.73\pi, -0.39\pi, 0.27\pi, 0.61\pi\}$ radians with array signal-to-noise ratios $\mathrm{ASNR} = \{6, 12, 18, 24\}$ dB, respectively. The two sources nearest endfire leak into beams on the opposing side of the array; note the small “backlobe” peaks near $\phi \approx -0.27\pi$ and $\phi \approx 0.73\pi$. The eigenbeam subspace is the same subspace illustrated in Figures 5.3.1 and 5.3.2 with the same dimension, $D = 48$. The number of snapshots is $K = 3 \times D$, enough to guarantee the eigenbeam covariance is well-estimated [9]. The ABF power estimates are non-Gaussian random variables, so their median values and the 95% confidence region are indicated as determined from 10,000 Monte-Carlo trials. The element-space processor is severely affected by the low sample support, yielding a large power bias and unreliable estimates. By contrast, the eigenbeam processor is near optimum. Eigenbeam processing reduces the power bias by ≈ 6 dB and yields more reliable power estimates. The comparison in Figure 5.3.4 is impossible in many scenarios because the training is too short for element-space processing. For example, Figure 1.4.3 predicts only $K \approx 108$ available snapshots in the scenario described above. Eigenbeam adaptive processing is well-conditioned with this support, but element-space adaptive processing is impossible without modification.

Figure 5.3.4: ABF comparison with $K = 3 \times D$ snapshots (horizontal axis: angle in radians)

One benefit of dimension reduction is an improvement in output bias, or a diminished “loss.” Sample matrix inverse (SMI) processing estimates a covariance matrix from a finite number of snapshots. The output power of the processor is biased low because the covariance estimate is imperfect [9]. The power bias (in decibels) for a $D$-dimensional problem with $K > D$ snapshots is

$$10 \log_{10}\left(\frac{K + 2 - D}{K + 1}\right). \tag{5.3.19}$$

Reducing the problem dimension from $D_0$ to $D$ improves this bias by


$$10 \log_{10}\left(\frac{K + 2 - D}{K + 2 - D_0}\right). \tag{5.3.20}$$


The  improvement  is  most  dramatic  when  the  dimension  is  significantly  reduced  and 
the  number  of  snapshots  is  limited,  as  illustrated  in  Figure  5.3.4. 
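
As a quick arithmetic check of Equation 5.3.20, the sketch below evaluates the bias improvement for the scenario of Figure 5.3.4, taking $D_0 = 4N = 120$, $D = 48$, and $K = 3D = 144$ from the text. The function name is hypothetical.

```python
import math

def smi_bias_improvement_db(K, D, D0):
    """Bias improvement of Eq. 5.3.20 when reducing dimension from D0 to D."""
    if K + 2 - D0 <= 0:
        raise ValueError("the formula needs K + 2 > D0 for the element-space term")
    return 10.0 * math.log10((K + 2 - D) / (K + 2 - D0))

# Scenario of Figure 5.3.4: N = 30 (D0 = 4N = 120), D = 48, K = 3 * D = 144.
improvement = smi_bias_improvement_db(K=144, D=48, D0=120)
```

The ratio is 98/26, or a little under 6 dB, which is consistent with the bias reduction of about 6 dB reported for Figure 5.3.4.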

The  eigenbeam  techniques  developed  in  this  chapter  have  important  applications 
in  sector  and  local  subspace  processing.  This  chapter  primarily  analyzes  eigenbeam 
processing over all of u-space. It is possible to segment u-space into different regions,
or  “sectors,”  and  perform  eigenbeam  processing  within  each  sector.  It  is  also  possible 
to  process  each  beam  in  its  own  local  subspace.  Partitioning  u-space  into  multiple 
regions  produces  many  adaptive  processing  problems,  each  of  which  has  smaller  di¬ 
mension  than  the  original.  Sector  and  local  subspace  techniques  are  discussed  further 
in  [5,  §3.10]. 


5.4  Differential  Geometry  of  Subspace  Design 

The previous section describes the performance improvements possible with subspace processing. A key result in the discussion is the observation in Equation 5.3.18 that the eigenbeam dimension grows linearly with both frequency and region size and remains smaller than the full data dimension. Although a detailed discussion is beyond the scope of this thesis, Equation 5.3.18 hints at the special structure of the vector-sensor array manifold. Parameterized by azimuth angle, the manifold $\mathbf{v}(\phi)$ represents a smooth curve on a radius-$\sqrt{2N}$ sphere in the high dimensional space $\mathbb{C}^{4N}$. Although $\mathbf{v}(\phi)$ moves around the sphere (as indicated by the sidelobes in the beampattern), it stays almost exclusively within the low-dimensional hyperplane given by the eigenbeam subspace. In this sense, $\mathbf{v}(\phi)$ moves around the sphere near the “equator.” The curve appears flat on a small scale, but its dimension grows linearly as the scale increases (see Figure 5.3.3). Subspace design is closely tied to this geometric picture and other concepts from differential geometry.

Figure 5.5.1: Efficiency of an example subspace processor (horizontal axis: array JNR in dB)

5.5  Performance  of  Adaptive  Processing 

The adaptive processing techniques described in this chapter approach the
performance predicted in Chapter 3 under many circumstances. Unlike the fixed weights
designed in Chapter 4, adaptive processors perform well even in the presence of strong
interference. Figure 5.5.1 illustrates this behavior with the standard $N = 10$ element
VSA steered to $\phi_0 = -\pi/4$ at frequency $f = 5 f_d / 7$. The NRMSE metric and
Cramér-Rao bound are discussed in Section 3.2. The eigenbeam processor displayed in
Figure 5.5.1 utilizes $D = 19$ dimensions and $K = 50$ snapshots. The subspace ABF is not
unbiased, but the Cramér-Rao bound is still helpful. Comparing Figures 5.5.1 and
4.4.1 reveals that the left/right performance of adaptive processing is substantially
better than fixed weight processing when the interference is strong. Furthermore,
advanced adaptive processors likely achieve better NRMSE performance than the simple
SMI processor shown in the figure. Figure 5.3.4 reveals that adaptive processing also
increases resolution, an improvement not captured by the NRMSE metric.
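
As a rough illustration of the SMI processor mentioned above, the sketch below forms sample-matrix-inverse MVDR weights in a $D = 19$ dimensional space from $K = 50$ snapshots. It is a generic textbook construction, not the thesis's simulation: the function name `smi_mvdr_weights`, the `loading` parameter, and the random placeholder replica and interferer vectors are all assumptions introduced here.

```python
import numpy as np

def smi_mvdr_weights(snapshots, steering, loading=0.0):
    """Sample-matrix-inverse MVDR weights.

    snapshots: (D, K) complex array of K observations in a D-dim (sub)space.
    steering:  (D,) replica vector expressed in the same space.
    loading:   optional diagonal loading (an added safeguard, not from the text).
    """
    d, k = snapshots.shape
    r_hat = snapshots @ snapshots.conj().T / k       # sample covariance
    r_hat = r_hat + loading * np.eye(d)
    r_inv_v = np.linalg.solve(r_hat, steering)       # R^-1 v without forming R^-1
    return r_inv_v / (steering.conj() @ r_inv_v)     # normalize so w^H v = 1

rng = np.random.default_rng(0)
D, K = 19, 50
v = np.exp(1j * rng.uniform(0, 2 * np.pi, D))        # placeholder replica
jammer = np.exp(1j * rng.uniform(0, 2 * np.pi, D))   # placeholder interferer
noise = (rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))) / np.sqrt(2)
amps = 10 * (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
x = noise + np.outer(jammer, amps)                   # strong interferer plus unit noise

w = smi_mvdr_weights(x, v)
response_look = abs(w.conj() @ v)                    # response in the look direction
response_jam = abs(w.conj() @ jammer)                # response toward the interferer
print(response_look, response_jam)
```

Working in a reduced dimension matters here because the sample covariance is only invertible when the number of snapshots is at least the dimension; with $K = 50$ the 19-dimensional problem is comfortably overdetermined.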
Chapter 6


Conclusion 


The  analyses  and  techniques  presented  in  this  thesis  enable  substantial  improvements 
in vector-sensor array processing. Building on the fundamentals enumerated in Chapter
2, the thesis bounds vector-sensor performance and describes near-optimal processing
techniques.

6.1  Summary  of  Processing  Improvements 

Each  chapter  of  the  thesis  focuses  on  improving  a  different  branch  of  vector-sensor 
array  processing  as  described  in  Section  1.5.  The  performance  bounds  developed  in 
Chapter  3  tie  the  processing  techniques  together  in  Figures  4.4.1  and  5.5.1.  These 
figures  quantify  the  left/right  rejection  of  both  techniques,  but  they  do  not  give  a 
sense  of  overall  performance.  Figure  6.1.1  illustrates  the  improved  VSA  processing 
achieved  by  this  work.  The  four-source  scenario  described  in  Section  5.3.3  is  simulated 
with the standard $N = 10$ element vector-sensor array at $f = 5 f_d / 7$. The number of
observations  is  K  =  50. 

The top axis in Figure 6.1.1 displays conventional VSA processing. With conventional
processing, strong sources on one side ($\phi > 0$) of the array have high sidelobes
and backlobes that interfere with sources on the other side ($\phi < 0$). False peaks at
$\phi \approx -\pi/4$ and $\phi \approx -5\pi/8$ make detecting the true sources difficult. It is impossible
to determine the location and number of sources from conventional processing alone.
Figure 6.1.1: Example of improved VSA processing

The center axis in Figure 6.1.1 displays the output of a beamformer using the
minimum sensitivity fixed weights described in Section 4.3.3. The weights have a
maximum sidelobe level of -25 dB. Minimum sensitivity weights reject sidelobe
interference without sacrificing much resolution, so the four true sources are clearly
visible without strong false peaks. The left/right resolution is reasonably close to the
optimum result. The limits of fixed weights are also illustrated: weak sidelobes are
visible on the strongest source and the array resolution is far from optimal.

The bottom axis in Figure 6.1.1 displays the eigenbeam adaptive processing
described in Chapter 5. Adaptive processing in the low-dimensional ($D = 19$) subspace
yields fast convergence and near-optimum results. Similar processing in element-space
produces a biased and unreliable output. Eigenbeam adaptive processing reduces the
left/right ambiguities more than fixed weight processing, but the most significant
improvement is the increased resolution.


6.2  Future  Work  in  VSA  Processing 

The last chapter of this thesis is not the final chapter in vector-sensor array research.
The  doors  opened  by  this  work  lead  to  many  unexplored  and  interesting  areas  within 
array  processing: 

•  Extending the convex optimization algorithms described in Chapter 4 to arbitrary
arrays and stochastic models. A generalized beampattern design algorithm
could be easily constructed around a second-order cone solver (see Section 4.3.4).
Such an algorithm would be a powerful tool for arrays that are nonlinear,
mismatched, or both.

•  Deriving optimal “local” subspaces for each beam. The eigenbeam subspaces in
Chapter 5 support processing of signals and interference within a given region.
One alternative problem is designing a subspace for each beam. A weighted least
squares approach yields a modified eigenbeam technique. The dimension of such
local subspaces is likely to be small and may only depend weakly on the array
length, but computational load and processing artifacts may be problematic.

A number of interesting topics specific to vector-sensor arrays also arise:

•  Extending this work to resolve pressure-sensor grating lobes. Directional
information allows for unambiguous vector-sensor array processing at all angles
and frequencies. This work focused on resolving left/right pressure ambiguities.
The same techniques may resolve pressure-sensor spatial aliasing above
the array design frequency (see [5, §2.4] and [1]).

•  Matched field processing with vector-sensor arrays. The directional measurements
provided by a horizontal vector-sensor array allow limited vertical resolution.
Leveraging this vertical resolution could reduce the problematic ambiguities
in matched field processing (see [21]).

•  Computationally efficient adaptive processing. The point null approach in
Sections 4.1 and 4.2 is easily transformed into an adaptive sidelobe canceller.
Adapting only to the backlobe and grating lobes requires little computation
but provides less benefit than fully-adaptive beamforming.

The  techniques  developed  in  this  thesis  provide  a  foundation  for  further  research  and 
indicate  the  bright  future  ahead  for  acoustic  vector-sensor  arrays. 
Appendix A


Supplemental  Material 


A.1  Symmetric Noise Distributions


This appendix briefly proves a statement made in Section 3.1: for “left/right
symmetric” noise distributions, the probability of error is a function of only the number
of  snapshots  and  the  K-L  divergence.  The  formal  definition  of  “left/right  symmetric” 
is  that 

$$
\mathbf{v}_0^H \mathbf{R}_n^{-1} \mathbf{v}_0 = \mathbf{v}_1^H \mathbf{R}_n^{-1} \mathbf{v}_1 \tag{A.1.1}
$$

for all azimuth angles. Recall that $\mathbf{v}_0$ and $\mathbf{v}_1$ are replica vectors for opposing sides
of a linear vector-sensor array. This definition agrees with intuition and implies the
only condition necessary for this proof:

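$$
\frac{|\mathbf{R}_1|}{|\mathbf{R}_0|}
= \frac{|\mathbf{R}_n + \sigma^2 \mathbf{v}_1 \mathbf{v}_1^H|}{|\mathbf{R}_n + \sigma^2 \mathbf{v}_0 \mathbf{v}_0^H|}
= \frac{|\mathbf{R}_n (\mathbf{I} + \sigma^2 \mathbf{R}_n^{-1} \mathbf{v}_1 \mathbf{v}_1^H)|}{|\mathbf{R}_n (\mathbf{I} + \sigma^2 \mathbf{R}_n^{-1} \mathbf{v}_0 \mathbf{v}_0^H)|}
= \frac{1 + \sigma^2 \mathbf{v}_1^H \mathbf{R}_n^{-1} \mathbf{v}_1}{1 + \sigma^2 \mathbf{v}_0^H \mathbf{R}_n^{-1} \mathbf{v}_0}
= 1. \tag{A.1.2}
$$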


For any noise distribution, Section 3.1 shows that the probability of error only depends
on the eigenvalues of the matrix $\mathbf{R}_1 \mathbf{R}_0^{-1}$ and the number of snapshots $K$. Applying
the matrix inversion lemma gives

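$$
\begin{aligned}
\mathbf{R}_1 \mathbf{R}_0^{-1}
&= (\mathbf{R}_n + \sigma^2 \mathbf{v}_1 \mathbf{v}_1^H)(\mathbf{R}_n + \sigma^2 \mathbf{v}_0 \mathbf{v}_0^H)^{-1} \\
&= (\mathbf{R}_n + \sigma^2 \mathbf{v}_1 \mathbf{v}_1^H)(\mathbf{R}_n^{-1} - \alpha \mathbf{R}_n^{-1} \mathbf{v}_0 \mathbf{v}_0^H \mathbf{R}_n^{-1}) \\
&= \mathbf{I} - \alpha \mathbf{v}_0 \mathbf{v}_0^H \mathbf{R}_n^{-1} + \sigma^2 \mathbf{v}_1 \mathbf{v}_1^H \mathbf{R}_n^{-1} - \sigma^2 \alpha \beta \mathbf{v}_1 \mathbf{v}_0^H \mathbf{R}_n^{-1}
\end{aligned}
\tag{A.1.3}
$$

where

$$
\alpha \triangleq (\sigma^{-2} + \mathbf{v}_0^H \mathbf{R}_n^{-1} \mathbf{v}_0)^{-1} \tag{A.1.4}
$$

$$
\beta = \mathbf{v}_1^H \mathbf{R}_n^{-1} \mathbf{v}_0. \tag{A.1.5}
$$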

The  matrix  in  Equation  A.1.3  can  be  written  as  the  identity  matrix  plus  a  matrix 
with  rank  no  greater  than  two.  Thus,  it  has  no  more  than  two  non-unity  eigenvalues. 
Because  the  determinant  is  one,  either  the  two  non-unity  eigenvalues  are  a  reciprocal 
pair  or  all  eigenvalues  are  unity.  In  either  case,  the  relationship  between  the  two 
(possibly) non-unity eigenvalues means that the trace of the matrix $\mathbf{R}_1 \mathbf{R}_0^{-1}$ fully
specifies its eigenvalues. Therefore, the trace together with the number of snapshots
specifies the probability of error. Recall that the K-L divergence between two zero-mean
Wishart distributions is

$$
D(p_0 \| p_1) = K \left[ \ln \frac{|\mathbf{R}_1|}{|\mathbf{R}_0|} + \operatorname{tr}(\mathbf{R}_1^{-1} \mathbf{R}_0) - 4N \right]. \tag{A.1.6}
$$

The  determinants  are  equal  for  left/right  symmetric  noise  distributions,  so  the  K-L 
divergence  with  the  number  of  snapshots  also  characterizes  the  probability  of  error. 
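
The eigenvalue structure claimed above is easy to check numerically. The sketch below is an illustration under assumptions made here, not part of the proof: it uses white noise ($\mathbf{R}_n = \mathbf{I}$), an arbitrary dimension `n`, an arbitrary signal power `sigma2`, and a conjugated random vector as a stand-in for the mirrored replica, which guarantees equal norms and hence left/right symmetry.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8                                    # illustrative dimension
sigma2 = 3.0                             # illustrative signal power

# White noise is left/right symmetric whenever the replicas have equal norm.
R_n = np.eye(n)
v0 = np.exp(1j * rng.uniform(0, 2 * np.pi, n))
v1 = v0.conj()                           # stand-in for the mirrored replica

R0 = R_n + sigma2 * np.outer(v0, v0.conj())
R1 = R_n + sigma2 * np.outer(v1, v1.conj())

# R1 R0^-1 is similar to a Hermitian positive definite matrix, so its
# eigenvalues are real; any imaginary part is numerical noise.
eigs = np.sort(np.linalg.eigvals(R1 @ np.linalg.inv(R0)).real)
non_unity = eigs[~np.isclose(eigs, 1.0)]
print(non_unity, np.prod(eigs))
```

At most two eigenvalues differ from one, and their product is one, so a single number (the trace) pins down the whole spectrum.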

Although  not  required  for  the  proof,  notice  that  the  K-L  divergence  has  a  simple 
form  similar  to  Equation  3.1.29  for  any  left/right  symmetric  noise  distribution.  This 
is  shown  by  applying  the  matrix  inversion  lemma  to  A.1.6,  expanding  terms,  and 
evaluating  the  trace. 
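
One way to carry out the steps just named is sketched below. This is a worked derivation under the appendix's own definitions, with one added shorthand that is an assumption of this sketch: $s$ denotes the common value $\mathbf{v}_0^H \mathbf{R}_n^{-1} \mathbf{v}_0 = \mathbf{v}_1^H \mathbf{R}_n^{-1} \mathbf{v}_1$ implied by left/right symmetry, so the same $\alpha$ from Equation A.1.4 applies to both sides of the array.

```latex
\begin{aligned}
\mathbf{R}_1^{-1}
  &= \mathbf{R}_n^{-1} - \alpha\, \mathbf{R}_n^{-1} \mathbf{v}_1 \mathbf{v}_1^H \mathbf{R}_n^{-1},
  \qquad \alpha = \frac{\sigma^2}{1 + \sigma^2 s}, \\
\operatorname{tr}\!\left(\mathbf{R}_1^{-1} \mathbf{R}_0\right)
  &= \operatorname{tr}(\mathbf{I}) + \sigma^2 s - \alpha s - \alpha \sigma^2 |\beta|^2, \\
\operatorname{tr}\!\left(\mathbf{R}_1^{-1} \mathbf{R}_0\right) - \operatorname{tr}(\mathbf{I})
  &= \frac{\sigma^4 \left(s^2 - |\beta|^2\right)}{1 + \sigma^2 s}.
\end{aligned}
```

Because the log-determinant term vanishes by Equation A.1.2, the divergence is proportional to this last expression, which depends on the noise only through $s$ and $|\beta|$.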




A.2  Weights and the Array Manifold


Array processing often deals with array manifolds exhibiting a given property. That is,
every replica vector $\mathbf{v}(\Theta)$ that forms the array manifold exhibits the same property.
It seems natural that any weight $\mathbf{w}$ applied to the manifold might inherit such a
property, e.g.

P1 Conjugate symmetric weights may be sufficient for a conjugate symmetric manifold.

P2 Weights with element-wise linear phase may be sufficient for a manifold whose
replicas have element-wise linear phase.

As trivial and intuitive as these assumptions seem, they are difficult to prove and
some are false. An incorrect assumption is overly restrictive and may lead to
suboptimal weights. This appendix proves that P1 is generally true and suggests that
P2 is generally not. More importantly, it provides weak sufficient conditions for any
property to be transferred from the array manifold to the weights.

Throughout the discussion, “P” denotes the property or the set of vectors exhibiting
the property. The following set of conditions guarantees that a property, P, of
the array manifold is transferred to a weight:

C1 The weight is a solution to a convex optimization problem.

C2 The gradients of all objective and constraint functions exhibit P at any point
exhibiting P. Formally, for any objective or constraint function $f(\mathbf{w})$, this requires
$[\nabla_{\mathbf{w}^*} f(\mathbf{w})]_{\mathbf{w}_0} \in P \;\; \forall \, \mathbf{w}_0 \in P$.

C3 The property is preserved under real, linear combination. Formally, if $\mathbf{x}_1 \in P$
and $\mathbf{x}_2 \in P$ then $a\mathbf{x}_1 + b\mathbf{x}_2 \in P$ for all real $a$ and $b$.

Note that the first condition applies only to the problem, the second applies to both
the problem and the property, and the third applies only to the property.

The above conditions are deemed “weak” because none is overly restrictive in
practice. The first condition applies to the most useful set of problems, those for which
optimality is easily proved and local extrema do not exist. The third condition, C3, is
weak because any property not satisfying C3 forms a concave set. For such properties,
the original (convex) optimization problem under the additional constraint $\mathbf{w} \in P$
is non-convex and likely more difficult. Thus, although the solution may exhibit P,
this information is not necessarily helpful. The second condition is weak because
it is satisfied by many problems and properties of interest when C3 is satisfied.
This includes the beampattern synthesis, optimum beamforming, and weight design
problems studied in this thesis.

The proof consists of showing that there exists a sequence of weights $\{\mathbf{w}_0, \mathbf{w}_1, \ldots\}$,
$\mathbf{w}_n \in P$, that converges to a global optimum. Under the conditions C1-C3 above,
the proof is trivial thanks to the convergence of various first-order optimization
algorithms. In solving the canonical convex optimization problem


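$$
\begin{array}{lll}
\text{minimize}   & f(\mathbf{w}) & \\
\text{subject to} & g_n(\mathbf{w}) \le 0, & n = 1, 2, \ldots, N \\
                  & h_m(\mathbf{w}) = 0,   & m = 1, 2, \ldots, M
\end{array}
\tag{A.2.1}
$$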


a number of first-order optimization algorithms with proven convergence iterate by
taking steps in the direction opposite a generalized gradient. These algorithms include
first-order variants of non-differentiable exact penalty and Lagrangian methods. In
this case, the set of generalized gradients is contained in the convex hull formed
from the individual gradients of the objective and constraint functions:


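$$
\nabla_{\mathbf{w}^*} \subset \left\{ \nabla_{\mathbf{w}^*} f(\mathbf{w})
  + \sum_{n=1}^{N} \alpha_n \nabla_{\mathbf{w}^*} g_n(\mathbf{w})
  + \sum_{m=1}^{M} \beta_m \nabla_{\mathbf{w}^*} h_m(\mathbf{w})
  \;:\; \alpha_n \ge 0, \; \beta_m \ge 0 \right\}.
\tag{A.2.2}
$$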


The conditions C2 and C3 imply that the step direction exhibits P because it is a
real, linear combination of objective and constraint gradients exhibiting P. For the
algorithms mentioned, condition C1 guarantees convergence starting from any point.
Choosing a starting point $\mathbf{w}_0 \in P$ without loss of generality, condition C3 guarantees
that a step along any generalized gradient also satisfies P. Thus, every weight in the
sequence of iterates satisfies P. Because the sequence converges to a global optimum,
that global optimum must satisfy P.
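
The argument can be watched in action on a small example. The sketch below is a hypothetical instance chosen here for illustration, not a problem from the thesis: it minimizes the strictly convex quadratic $\mathbf{w}^H \mathbf{R} \mathbf{w} - 2\,\mathrm{Re}(\mathbf{v}^H \mathbf{w})$ by plain gradient descent, with $\mathbf{R}$ and $\mathbf{v}$ constructed so that the gradient is conjugate symmetric whenever $\mathbf{w}$ is (condition C2 for property P1). All names, sizes, and the step-size rule are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
J = np.eye(n)[::-1]                      # exchange (flip) matrix

def is_conj_symmetric(x, tol=1e-9):
    """True if x equals its flipped complex conjugate (property P1)."""
    return bool(np.allclose(x, J @ x.conj(), atol=tol))

# Hypothetical strictly convex problem: minimize w^H R w - 2 Re(v^H w).
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = B @ B.conj().T / n + np.eye(n)       # Hermitian positive definite
R = A + J @ A.conj() @ J                 # built so that J conj(R) J = R
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v = u + J @ u.conj()                     # conjugate symmetric by construction

w = np.zeros(n, dtype=complex)           # starting point exhibits P1
step = 1.0 / np.linalg.eigvalsh(R)[-1]   # 1 / largest eigenvalue of R
stayed_in_P = True
for _ in range(2000):
    grad = R @ w - v                     # gradient with respect to conj(w)
    stayed_in_P = stayed_in_P and is_conj_symmetric(grad)
    w = w - step * grad                  # real-scaled step along a gradient in P
    stayed_in_P = stayed_in_P and is_conj_symmetric(w)

w_opt = np.linalg.solve(R, v)            # closed-form optimum for comparison
print(stayed_in_P, np.linalg.norm(w - w_opt))
```

Every iterate remains conjugate symmetric, and the limit coincides with the closed-form optimum, which therefore exhibits the property as well.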

The above proof deserves several comments. First, although the proof involves
first-order algorithms moving along generalized gradients, this is only to show that
a global optimum exhibiting P exists; it does not constrain the type of algorithm
used in practice. Second, if the problem is strictly convex (C1 is strengthened), the
unique optimum is proved to exhibit P. Third, if the problem is non-convex (C1 is
weakened), there exist local optima exhibiting P to which the first-order algorithms
above will converge.

Testing the two properties in the first paragraph, P1 and P2, for the conditions
reveals that P1 satisfies C3 but P2 does not. Thus, P1 is generally true but P2
need not be.
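
The closure test in condition C3 can be illustrated with a few lines of code. This is a minimal sketch under an interpretation assumed here: “element-wise linear phase” is taken to mean that the unwrapped phase of the elements is an affine function of the element index. The test vectors and helper names are illustrative only.

```python
import numpy as np

n = np.arange(6)

def is_conj_symmetric(x, tol=1e-9):
    """P1: the vector equals its flipped complex conjugate."""
    return bool(np.allclose(x, x[::-1].conj(), atol=tol))

def has_linear_phase(x, tol=1e-9):
    """P2 (as interpreted here): unwrapped element phases lie on a line."""
    phase = np.unwrap(np.angle(x))
    return bool(np.allclose(np.diff(phase, 2), 0.0, atol=tol))

# Two conjugate symmetric vectors; a real combination stays in P1.
c1 = np.exp(1j * 0.3 * (n - 2.5))
c2 = 0.5 * np.exp(1j * 1.1 * (n - 2.5))
p1_closed = is_conj_symmetric(2.0 * c1 - 3.0 * c2)

# Two linear-phase vectors with different slopes; their sum leaves P2.
l1 = np.exp(1j * 0.3 * n)
l2 = 0.5 * np.exp(1j * 1.1 * n)
p2_closed = has_linear_phase(l1 + l2)

print(p1_closed, p2_closed)
```

A single counterexample is enough to show that P2 fails C3, whereas the P1 result here is only a spot check of a property that holds for all real coefficients.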
Appendix B


Nomenclature 

B.1  Acronyms


Acronym   Description
ABF       Adaptive Beamforming
ASNR      Array Signal to Noise Ratio
CBF       Conventional Beamforming
CRB       Cramér-Rao Bound
DOA       Direction of Arrival
FIR       Finite Impulse Response
JNR       Jammer to Noise Ratio
LP        Linear Program
MSE       Mean Squared Error
MVDR      Minimum Variance Distortionless Response
NRMSE     Normalized Root Mean Squared Error
PCML      Physically Constrained Maximum Likelihood
PSA       Pressure-sensor Array
RMSE      Root Mean Squared Error
SINR      Signal to Interference-Plus-Noise Ratio
SMI       Sample Matrix Inverse
SNR       Signal to Noise Ratio
VSA       Vector-sensor Array
B.2  Notation


Notation                                                       Description                            Example
$a$                                                            Scalar variable                        Eqn. 3.1.31
$\mathbf{a}$                                                   Vector variable                        Eqn. 1.7.1
$\mathbf{a}^H$                                                 Conjugate (or Hermitian) transpose     Eqn. 2.1.1
$\mathbf{a}^*$                                                 Conjugation                            Eqn. 3.2.16
$\mathbf{a}^T$                                                 Transpose                              Eqn. 1.7.1
$\mathbf{a} \odot \mathbf{b}$                                  Element-wise (or Hadamard) product     Eqn. 2.4.11
$\mathbf{a} \otimes \mathbf{b}$                                Tensor (or Kronecker) product          Eqn. 4.2.7
$[\mathbf{a}, \mathbf{b}]$ or $[\mathbf{a} \;\, \mathbf{b}]$   Horizontal concatenation               Eqns. 1.7.1 or 3.1.2
$D(p_0 \| p_1)$                                                Kullback-Leibler divergence            Eqn. 3.1.24
$\mathbf{A} \succeq 0$                                         Matrix A is positive semidefinite      Eqn. 3.2.1
$[\mathbf{A}]_{ij}$                                            (i, j)th element of the matrix A       Eqn. 3.2.2
$E\{x\}$                                                       Expectation of random variable x       Eqn. 4.4.3
$a \triangleq b$                                               Definition of a                        Eqn. 5.3.1
$\mathbb{R}^N$ or $\mathbb{C}^N$                               Real or complex N-dimensional space    Sec. 5.1
$G(D, D_0)$                                                    Grassmann manifold                     Sec. 5.1
$\langle \mathbf{a}, \mathbf{b} \rangle$                       Inner product                          Eqn. 5.2.1
$\lambda(\mathbf{R})$                                          Largest eigenvalue of R                Eqn. 5.3.9
$\operatorname{tr}(\mathbf{A})$                                Trace of the matrix A                  Eqn. 5.3.7
$|\mathbf{A}|$                                                 Determinant of the matrix A            Eqn. 3.1.6
                                                               Derivative of f(x)                     Eqn. 2.4.15
$\operatorname{diag}(\mathbf{A})$                              Main diagonal of A                     Eqn. 3.2.17
$\begin{bmatrix} \mathbf{a} \\ \mathbf{b} \end{bmatrix}$       Vertical concatenation                 Eqn. 1.7.8